Best way to interop C# System.String with C++ std::string&

  • Thread starter: Edson Manoel

I have some C++ unmanaged code that takes std::string& arguments (as
reference), and fills them (possibly growing the string).

I want to call this code through PInvoke (DllImport), possibly using
wrapper layers in unmanaged C++ and C#.

I've thought about two approaches:

1) To pass a StringBuilder, this is converted to a char* in C++, the
wrapper code converts the char* to a std::string (copy), and in the
end, copies the std::string content back to the char*. The problem is
that the StringBuilder cannot grow in C++ code (or maybe it can? with
callbacks?), it must be initialized in C# with a size that is not
known beforehand; this also seems insecure.

2) To pass a "ref String", this is converted to a "char** arg". The
wrapper code converts *arg to a string, and in the end, should set
*arg to a new value. I don't know if the *arg pointer can really be
changed... if I create a new char[] block in the C++ heap and pass it
back (setting it to *arg), it results in a heap error (although the
string returned is correct).
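
For approach 1, one common way around the unknown buffer size is a two-call pattern: ask the native side how much space it needs, then call again with a buffer of that capacity. The sketch below is only an illustration under stated assumptions: `fill_string` stands in for the existing `std::string&` function, and `fill_string_buffer` is a hypothetical C-style export that a DllImport declaration could bind to (input as a plain string, output as a StringBuilder).

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Stand-in for the existing native function that takes std::string&
// (hypothetical; it just appends text so the string grows).
static void fill_string(std::string& s) {
    s += " [filled by native code]";
}

// Hypothetical C-style export. Returns the capacity needed for the full
// result, including the null terminator. The result is written to `output`
// only when `capacity` is large enough; otherwise `output` is left untouched.
extern "C" int fill_string_buffer(const char* input, char* output, int capacity) {
    assert(capacity >= 0);
    std::string s = (input != nullptr) ? input : "";
    fill_string(s);

    const int needed = static_cast<int>(s.size()) + 1;
    if (output != nullptr && capacity >= needed) {
        // c_str() is null-terminated, so copying `needed` bytes includes the terminator.
        std::memcpy(output, s.c_str(), static_cast<std::size_t>(needed));
    }
    return needed;
}
```

The caller would first pass a null output with capacity 0, size its buffer from the return value, and call again. Note that this runs the native function twice, so it only fits functions without side effects.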

Both approaches involve copying the data (sometimes more than
twice), but for now I just want to make it work. Are there better
ways to accomplish this?

Thanks,
Edson
 
Not sure if this would help, but here it is:

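System::Void ManagedToBasicString(String^ iSourceStr,
                                  std::string &oTargetStr)
{
    // StringToHGlobalAnsi allocates unmanaged memory that must be freed
    IntPtr ansi = Marshal::StringToHGlobalAnsi(iSourceStr);
    try {
        oTargetStr = static_cast<const char*>(ansi.ToPointer());
    }
    finally {
        Marshal::FreeHGlobal(ansi); // avoid leaking the temporary copy
    }
}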

Good luck:-)
 
How about this:

- you write a mixed (managed/native) wrapper
- you pass a System::String by reference to the wrapper
- in the wrapper scope, you fetch the std::string from unmanaged code
  and assign it to the referenced System::String

I don't see why that wouldn't work.
 
Edson... Is it possible that your code could take a managed string as an in
parameter, convert the managed string to char*, call the C++ code, and then
RETURN a new managed string on method exit? The following code was a quick
hack taking a std::string and returning a managed string. I readily admit I
only dabble in STL/CLI.

#include <string>
#include <cstring>

using namespace System;
using namespace System::Runtime::InteropServices; // Marshal

class Util {
public:
    /////////////////////////////////////////////////////////
    // ** ConvertSS2MS **                                  //
    // Converts std::string to managed String^             //
    // Parameters: const std::string in, by reference      //
    // Returns String^ out                                 //
    // Static public class method                          //
    // Internally converts to most common denominator      //
    // char* on heap using new and delete                  //
    // std::string ref in cannot be null, but may be empty //
    // Returns empty string on exception                   //
    // JAL 12/04/08                                        //
    /////////////////////////////////////////////////////////

    static String^ ConvertSS2MS(const std::string& in) {
        String^ out = L""; // L prefix: wide (wchar_t) literal
        char* str = 0;
        size_t size = strlen(in.c_str()) + 1; // +1 for null terminator

        try {
            //const char *p= in.c_str(); // use p immediately
            //out= Marshal::PtrToStringAnsi(static_cast<IntPtr>(p)); // error on const char *
            str = new char[size]; // null terminated
            // create a longer-lived copy in a non-const char array;
            // strcpy_s has security enhancements over strcpy, size includes null char
            strcpy_s(str, size, in.c_str());
            out = Marshal::PtrToStringAnsi(static_cast<IntPtr>(str)); // or safe_cast?
        }
        catch (...) { // eat the exception
            out = L""; // returns empty string on exception
        }
        finally { // clean up memory
            delete [] str; // release char[] on unmanaged heap; no-op if str is null
        }
        return out;
    } // end ConvertSS2MS
}; // end class Util

Regards,
Jeff
 
Edson said:
All of these approaches involve copying the data (sometimes more than
twice), but for now I just want to make it work. Are there better
ways to accomplish this?

You will always have to copy the data, because .NET strings are
internally encoded as UTF-16 (two-byte code units), whereas a C++
std::string holds one-byte chars in an implementation- or
application-defined encoding (an ANSI code page, UTF-8, etc.). So
there's no way to just share a common buffer, if you were thinking of
that.
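
To make the size mismatch concrete, here is a small sketch that stores the same four-character text both ways. It assumes the std::string holds UTF-8, which is only one possibility (it could equally hold text in an ANSI code page); the names are made up for the example.

```cpp
#include <cstddef>
#include <string>

// The text "café" as UTF-16 code units, the representation System.String uses.
const std::u16string utf16_text = u"caf\u00E9";

// The same text as UTF-8 bytes in a std::string (assumed encoding).
const std::string utf8_text = "caf\xC3\xA9";

// Bytes occupied by the character data of each string.
std::size_t utf16_bytes() { return utf16_text.size() * sizeof(char16_t); }
std::size_t utf8_bytes()  { return utf8_text.size() * sizeof(char); }
```

The two buffers differ in both element size and total length, so one cannot be reinterpreted as the other; a conversion (and therefore a copy) is needed in each direction.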
--http://www.kynosarges.de
 
Thanks for all the answers!

I solved it by passing a ref string; this is marshaled to a char**.
Dereferencing it in C++, I get a char*, then I build a std::string
from it and pass it to my C++ method. After the call to the method, I
use CoTaskMemFree to free the original char*, and then I allocate a
new char* buffer with CoTaskMemAlloc, with the returned std::string
size().
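
A minimal sketch of the native side of this approach, under stated assumptions: `fill_string` stands in for the real `std::string&` function, `fill_string_ref` is a hypothetical export name, and `interop_alloc` / `interop_free` are placeholders that would be CoTaskMemAlloc / CoTaskMemFree on Windows (malloc / free are used here only so the sketch compiles anywhere).

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Placeholders for the allocator shared with the marshaler (assumption:
// replace with CoTaskMemAlloc / CoTaskMemFree in a real Windows build).
static void* interop_alloc(std::size_t n) { return std::malloc(n); }
static void interop_free(void* p) { std::free(p); }

// Hypothetical native function with the std::string& signature.
static void fill_string(std::string& s) { s += " [filled by native code]"; }

// Hypothetical export matching a "ref string" parameter (char**).
// Returns 0 on success, -1 on failure; *arg is left untouched on failure.
extern "C" int fill_string_ref(char** arg) {
    assert(arg != nullptr);
    std::string s;
    try {
        s = (*arg != nullptr) ? *arg : "";
        fill_string(s);
    } catch (...) {
        return -1;  // do not let C++ exceptions cross the C boundary
    }

    // size() excludes the null terminator, so allocate one extra byte.
    char* result = static_cast<char*>(interop_alloc(s.size() + 1));
    if (result == nullptr) return -1;
    std::memcpy(result, s.c_str(), s.size() + 1);

    interop_free(*arg);  // release the buffer that was passed in
    *arg = result;       // hand the new buffer back to the caller
    return 0;
}
```

The important points are that the replacement buffer comes from the same allocator the marshaler will later free it with, and that it has room for the terminator; a buffer from `new char[]` handed back this way would be freed by the wrong allocator.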

So, in the end, to just pass one string and get it returned, there are
at least 4 copies being made (C# string to char* conversion; char* to
std::string / input and output), 3 allocations (char*, std::string and
my own CoTaskMemAlloc) and 3 deallocations. Well, at least, It Just
Works® :p. I wish that it were faster, but it seems that C# (and CLR
in general) weren't made to speak with *unmanaged* C++ code... (only
C).

I guess this does not leak memory, am I right?

I learned from some websites (MSDN, etc.) that .NET uses
CoTaskMemAlloc/CoTaskMemFree. I couldn't debug into the Microsoft code
that does the marshaling (interop)... is there any way to do this, or am
I not allowed to see the magic trickery that is going on behind the
curtain?

Thanks,
Edson
 
Edson said:
So, in the end, to just pass one string and get it returned, there are
at least 4 copies being made (C# string to char* conversion; char* to
std::string / input and output), 3 allocations (char*, std::string and
my own CoTaskMemAlloc) and 3 deallocations. Well, at least, It Just
Works® :p. I wish that it were faster, but it seems that C# (and CLR
in general) weren't made to speak with *unmanaged* C++ code... (only
C).

Well, this is only partly true. There is no support in the CLR for C++
interop other than what is needed to make the C++/CLI compiler work. You
are expected to use C++/CLI for tasks like this.

Here is how you would avoid the extra copies (still need Unicode conversion
of course):

void LetsNativeCodeOperateOnManagedString(System::String^% managedString)
{
    int len = managedString->Length;
    std::string nativeString;
    // resize (not reserve) so the buffer really holds len chars;
    // this assumes a single-byte code page (one byte per character)
    nativeString.resize(len);
    // pin the managed characters while native code reads them
    pin_ptr<const wchar_t> wide = PtrToStringChars(managedString);
    WideCharToMultiByte(..., wide, len,
                        &(nativeString[0]), len, ...);

    callNativeFunction(nativeString);

    // gcnew String, not gcnew String^; pass the char data, not the std::string
    managedString = gcnew System::String(nativeString.c_str(), 0,
        static_cast<int>(nativeString.length()));
}
 
Ben said:
Well, this is only partly true. There is no support in the CLR for
C++ interop other than what is needed to make the C++/CLI compiler
work. You are expected to use C++/CLI for tasks like this.

Here is how you would avoid the extra copies (still need Unicode
conversion of course):

Of course 2008 gives you all of this in the Microsoft-provided marshal_as
function.