Exposing bool to non-MS, non-C++ callers

Bob Altman

Hi all,

Suppose I export a function or a structure from a standard DLL that uses the
C++ "bool" data type. The documentation states that, as of Visual C++ 5.0,
bool is implemented as a 1 byte data type. If someone puts a
value other than 1 or 0 into the byte and calls into my code, what happens?
Am I guaranteed that C++ will interpret a bool as "zero is false and
non-zero is true"? If the caller supplies the value 3 for my bool variable,
will "myVar = false" return true or false?

TIA - Bob
 
Doug Semler

Bob Altman said:
Hi all,

Suppose I export a function or a structure from a standard DLL that uses
the C++ "bool" data type. The documentation states that, as of C++ 5.0
and later, Visual C++ implements bool as a 1 byte data type. If someone
puts a value other than 1 or 0 into the byte and calls into my code, what
happens? Am I guaranteed that C++ will interpret a bool as "zero is false
and non-zero is true"? If the caller supplies the value 3 for my bool
variable, will "myVar = false" return true or false?


I presume you mean (myVar == false) since myVar = false is an assignment <g>

Zero is false. Non zero is true. This is defined by the C++ language
spec (AFAIK). It doesn't matter that VC++ defines bool to be 1 byte...it could
be 8 bytes for all that it matters....the language defines it.
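
The conversion rule can be checked directly. This is a minimal sketch, assuming a conforming C++ compiler; the helper names (as_flag, as_int) are made up for illustration. It only covers values that the compiler converts to bool, not a bool object whose storage a foreign caller has filled with some other byte.

```cpp
#include <cassert>

// Hypothetical helper: converts any integer to bool the way the language
// defines it (zero -> false, anything else -> true).
bool as_flag(int value)
{
    return value != 0;
}

// Converting a bool back to int always yields exactly 0 or 1.
int as_int(bool flag)
{
    return flag;
}

void conversion_demo()
{
    assert(as_flag(0) == false);
    assert(as_flag(3) == true);       // 3 converts to true...
    assert(as_int(as_flag(3)) == 1);  // ...and true converts back to 1, not 3
    assert(as_flag(-1) == true);
}
```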

On the other hand, you have to ask yourself why the caller is sending you a
"3" when you have declared your interface function to accept only a bool.
'bool' is a specific type (its representation is compiler defined) that should
only hold 'true' or 'false' (also compiler defined values) in the C++ world.
In a "perfect" world, a bool datatype would be 1 bit (of course it's
not...its size is up to the compiler, and it is never smaller than a byte).
--
Doug Semler, MCPD
a.a. #705, BAAWA. EAC Guardian of the Horn of the IPU (pbuhh).
The answer is 42; DNRC o-
Gur Hfrarg unf orpbzr fb shyy bs penc gurfr qnlf, abbar rira
erpbtavmrf fvzcyr guvatf yvxr ebg13 nalzber. Fnq, vfa'g vg?
 
Doug Harrison [MVP]

It would be undefined. I'd guess that used in conditionals, it would be OK,
but I really don't know. If this is a C++ API for use by C++ code, I
wouldn't worry about it. Otherwise, I'd use the Windows BOOL instead of the
C++ bool.

Doug Semler said:
Zero is false. Non zero is true. This is defined by the C++ language
spec (AFAIK). It doesn't matter that VC++ defines bool to be 1 byte...it could
be 8 bytes for all that it matters....the language defines it.

But the bool type can only hold two values, true and false.

Doug Semler said:
On the other hand, you have to ask yourself why the caller is sending you a
"3" when you have declared your interface function to accept only a bool.
The 'bool' has a specific (compiler defined) data type that should only
accept a 'true' or 'false' (also compiler defined values) in the C++ world.

Agreed.
 
Doug Semler

Doug Harrison said:
It would be undefined. I'd guess that used in conditionals, it would be OK,
but I really don't know. If this is a C++ API for use by C++ code, I
wouldn't worry about it. Otherwise, I'd use the Windows BOOL instead of the
C++ bool.

I'll go dig out my ISO C++ spec and find out if really necessary, but I have
a feeling you're right as far as that goes, since in C++ bool is a type,
which has two values, true and false....if an implementation decided it
wanted to implement 'true' as 0x12345678 and 'false' as 0x87654321 AFAIK, it
is completely free to do so.

OTOH, the Windows BOOL type is actually a DWORD, with TRUE (generally) being
0xFFFFFFFF.... and FALSE always being 0...

Doug Harrison said:
But the bool type can only hold two values, true and false.

(see above)



 
Doug Harrison [MVP]

Doug Semler said:
I'll go dig out my ISO C++ spec and find out if really necessary, but I have
a feeling you're right as far as that goes, since in C++ bool is a type,
which has two values, true and false....if an implementation decided it
wanted to implement 'true' as 0x12345678 and 'false' as 0x87654321 AFAIK, it
is completely free to do so.

That wouldn't be likely, since true and false convert to 1 and 0,
respectively, when used in arithmetic expressions. A typical approach is to
normalize non-bool values to true and false when converting to bool, with
true = 1 and false = 0, and then nothing has to be done to go the other
way.

Doug Semler said:
OTOH, the Windows BOOL type is actually a DWORD, with TRUE (generally) being
0xFFFFFFFF.... and FALSE always being 0...

In Windows, BOOL is a typedef for int, and TRUE is #defined as 1 and FALSE
as 0. Of course, you need to treat non-zero as true and in particular never
compare a BOOL against TRUE.
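
To illustrate why comparing against TRUE is risky, here is a sketch that uses stand-ins rather than the real Windows headers: Bool32, kTrue and kFalse are hypothetical names that mirror an int-based BOOL with TRUE defined as 1 and FALSE as 0.

```cpp
#include <cassert>

// Stand-ins for the Windows definitions (assumption: BOOL is an int,
// TRUE is 1, FALSE is 0, as described above).
typedef int Bool32;
const Bool32 kTrue = 1;
const Bool32 kFalse = 0;

// Fragile: only recognizes the single value 1 as "true".
bool is_true_fragile(Bool32 value)
{
    return value == kTrue;
}

// Robust: treats every non-zero value as "true".
bool is_true_robust(Bool32 value)
{
    return value != kFalse;
}

void bool32_demo()
{
    Bool32 all_bits_set = -1;  // a common representation of "true" elsewhere
    assert(!is_true_fragile(all_bits_set));  // wrongly reported as false
    assert(is_true_robust(all_bits_set));    // correctly reported as true
    assert(is_true_robust(1));
    assert(!is_true_robust(0));
}
```

The robust form is simply the "non-zero is true" rule written out, which is why it keeps working no matter which non-zero value a caller picks.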
 
Doug Semler

Doug Harrison said:
That wouldn't be likely, since true and false convert to 1 and 0,
respectively, when used in arithmetic expressions. A typical approach is to
normalize non-bool values to true and false when converting to bool, with
true = 1 and false = 0, and then nothing has to be done to go the other
way.

Well, of course, it would be idiotic to implement a compiler that way.
However, AFAIK, there's nothing in the C++ spec that says I can't implement
it that way as long as 'false' ultimately converts to '0' and true converts
to '1' in emitted assembly. It doesn't detract from the point that relying
on specific compiler conversions is bad practice (i.e. expecting '3' to
translate to 'true' - it may work on one compiler but not on another)

Dammit now you're going to make me go down to the basement tomorrow and
drag out my copy of the ISO spec <g>

But let's not get away from the original post:

The function was declared as expecting a 'bool' parameter. If the caller is
casting a '3' and passing that, the caller is doing something wrong. The
function is expecting a bool, which is a defined type in C++, with two valid
values, true and false, which are COMPILER defined. Not NECESSARILY 3.
The callee shouldn't have to worry about that...any bugs introduced by
passing the value '3' to the function are COMPLETELY the fault of the caller
not adhering to the function specification declared by the library,
regardless of how the compiler converts ints to bool....

Doug Harrison said:
In Windows, BOOL is a typedef for int, and TRUE is #defined as 1 and FALSE
as 0. Of course, you need to treat non-zero as true and in particular never
compare a BOOL against TRUE.

Damn, I coulda sworn BOOL was a DWORD <shrug>. Either way, I have NEVER
compared a BOOL result of a WinAPI function to TRUE, because most of them
have been documented to be

BOOL FunctionName()
....

On success returns non zero, return zero on failure...yadayada GetLastError
yadayada

 
Doug Semler

Bob Altman said:
Hi all,

Suppose I export a function or a structure from a standard DLL that uses
the C++ "bool" data type. The documentation states that, as of C++ 5.0
and later, Visual C++ implements bool as a 1 byte data type. If someone
puts a value other than 1 or 0 into the byte and calls into my code, what
happens? Am I guaranteed that C++ will interpret a bool as "zero is false
and non-zero is true"? If the caller supplies the value 3 for my bool
variable, will "myVar = false" return true or false?


After rereading your post, and thinking about it some more:

Are you REALLY sure you want to expose C++ data structures/functions from a
DLL? You are pretty much limiting the users of your library to the specific
version of the compiler used to generate the library, since name mangling
has changed (at least in my experience) in every version since 6.0...

Better to extern "C" everything and use standard C types in all data
structures <G>
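
A rough sketch of what such a C-friendly interface might look like. The names (Settings, count_enabled) are invented for illustration, and the export decoration (for example __declspec(dllexport) or a .def file) is left out so that the snippet stays portable.

```cpp
#include <cassert>

extern "C" {

// Plain C layout: flags are bytes, zero means false, non-zero means true.
typedef struct Settings
{
    unsigned char verbose;
    unsigned char readOnly;
} Settings;

// Returns how many flags are set; returns -1 for a null pointer.
int count_enabled(const Settings* settings)
{
    if (settings == 0)
        return -1;

    // Convert each byte to a real bool before using it.
    bool verbose = settings->verbose != 0;
    bool readOnly = settings->readOnly != 0;
    return (verbose ? 1 : 0) + (readOnly ? 1 : 0);
}

} // extern "C"

void settings_demo()
{
    Settings s = { 3, 0 };  // a caller that used 3 for "true"
    assert(count_enabled(&s) == 1);
}
```

Because the struct only contains C types and the function has C linkage, callers do not need to know anything about how the C++ compiler represents bool.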

 
Jeffrey Tan[MSFT]

Hi Bob,

This is an interesting question.

At the language level, the size of the C++ bool type is implementation
defined; Visual C++ implements it as one byte. This can be easily confirmed
with:
printf("Size of bool: %u\n", (unsigned)sizeof(bool));

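So, once the VC++ compiler sees the bool type, it converts any wider value
into a single byte holding 0 or 1. Let's take the following sample code as an
example:
#include <stdio.h>
#include <tchar.h>

void Test(bool test)
{
    // sizeof yields a size_t, so cast it to match the %u format.
    printf("Size of bool: %u\n", (unsigned)sizeof(bool));
}

int _tmain(int argc, _TCHAR* argv[])
{
    int i = 10000000;
    Test(i);   // implicit int -> bool conversion happens at the call site
    return 0;
}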

The VC++ compiler will emit the assembly code below:
cmp dword ptr [i],0
setne al
push eax
call Test (41122Bh)
add esp,4

As you can see, the VC++ compiler compares the input value with
zero and sets only the single byte "al" register for use as the bool
argument. This obeys the single byte rule.
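
The same normalization can be seen from C++ without reading assembly. A small sketch (the function names are illustrative and not part of the sample above): the value 256 has a zero low byte, so a plain truncation to one byte would give 0, while the conversion to bool still gives true.

```cpp
#include <cassert>

// Takes a bool parameter, like Test() above, and reports what arrived.
int received_value(bool test)
{
    return test ? 1 : 0;
}

// What a plain truncation to one byte would produce instead.
unsigned char truncated_to_byte(int value)
{
    return static_cast<unsigned char>(value & 0xFF);
}

void normalization_demo()
{
    int i = 256;  // low byte is zero
    assert(truncated_to_byte(i) == 0);  // truncation loses the value
    assert(received_value(i) == 1);     // int -> bool conversion does not
    assert(received_value(10000000) == 1);
    assert(received_value(0) == 0);
}
```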

Let's come back to your specific problem. If the call crosses the DLL
boundary, it is another story, and the answer to your question is not
consistent. Let me explain.

When your DLL project is compiled into a DLL binary file, all C++ type
information is lost. There is only binary code in the DLL. On the x86
platform, Windows requires parameters to be passed 4-byte aligned, so
even if your function parameter is of type "bool", the compiler reserves
4 bytes of stack space for it. Now, in the Exe project, if the caller
pushes a value larger than 1 byte but no larger than 4 bytes, Windows will
accept it without any problem; no crash or corruption will occur. Of course,
this depends on whether the Exe caller pushes a value larger than 1 byte. If
the caller also declares the parameter as a C++ bool (assuming the caller
is also written in C++), I think the caller's compiler will use the same
kind of trick (like the assembly code above used by the VC++ compiler) to
pass only a single byte value to your DLL exported function.

Note: if you wrap the bool type in a C++ structure, union or class, the
compiler may not align the bool member on a 4 byte boundary, for example when
"#pragma pack(1)" is in effect.
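
The effect of packing on a structure that contains a bool can be sketched as follows. This assumes a compiler that supports "#pragma pack" (Visual C++, GCC and Clang do); the struct names are hypothetical and the exact default sizes are implementation defined.

```cpp
#include <cassert>
#include <cstddef>

// Default packing: the int member is aligned, so padding may follow the bool.
struct DefaultLayout
{
    bool flag;
    int value;
};

#pragma pack(push, 1)
// Packed to 1 byte: no padding between the members.
struct PackedLayout
{
    bool flag;
    int value;
};
#pragma pack(pop)

void layout_demo()
{
    // The bool itself is the first member in both layouts.
    assert(offsetof(DefaultLayout, flag) == 0);
    assert(offsetof(PackedLayout, flag) == 0);

    // Only the packed struct is guaranteed to place the int right after it.
    assert(offsetof(PackedLayout, value) == sizeof(bool));
    assert(sizeof(PackedLayout) == sizeof(bool) + sizeof(int));
    assert(sizeof(DefaultLayout) >= sizeof(PackedLayout));
}
```

A caller written in another language has to reproduce whichever layout the DLL was built with, which is one more reason to document the packing of exported structures.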

Yes, I agree that it is better to use the Windows BOOL type for exports so
that we avoid this problem altogether.

Hope this helps.

Best regards,
Jeffrey Tan
Microsoft Online Community Support
==================================================
Get notification to my posts through email? Please refer to
http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx#notifications.

Note: The MSDN Managed Newsgroup support offering is for non-urgent issues
where an initial response from the community or a Microsoft Support
Engineer within 1 business day is acceptable. Please note that each follow
up response may take approximately 2 business days as the support
professional working with you may need further investigation to reach the
most efficient resolution. The offering is not appropriate for situations
that require urgent, real-time or phone-based interactions or complex
project analysis and dump analysis issues. Issues of this nature are best
handled working with a dedicated Microsoft Support Engineer by contacting
Microsoft Customer Support Services (CSS) at
http://msdn.microsoft.com/subscriptions/support/default.aspx.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Bob Altman

I guess my original posting could have been a little more complete (at the
risk of being longer and harder to digest). I have a standard DLL that is
called from FORTRAN and Ada clients (as well as C/C++ and VB.Net). Some of
the functions in this DLL accept a structure as an argument. That structure
contains some Boolean (true or false) data.

I initially implemented things in a completely "C-friendly" way, by defining
the structure as containing "char" (a.k.a. "byte") data, with the
specification that the data must be either zero for false or non-zero for
true. FORTRAN callers like to plunk -1 (all bits set) into the structure
for "true". C callers tend to use 1 for "true". Some callers use built-in
conversions or functions to cast their concept of "true" and "false" to a
byte data type, and they are often not consistent from operation to
operation, so I may get a combination of 1 and -1 values for "true".

In the code that accesses the structure, it's tempting to use the bitwise
logical operators (&&, ||, !) on the byte data, but that scheme fails
miserably if TRUE isn't consistently represented. If I want to operate
directly on the char data in the structure, I need to be super careful to
always cast the char data to bool on the fly (I use a macro: #define
tobool(x) ((x) != 0)). The compiler doesn't help catch places where I neglect
to do this. It just performs a bitwise logical operation and gives me a
numeric result.
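
One way to get more help from the compiler, sketched under the assumption that the structure stores its flags as signed char: replace the macro with a small inline function, so that every read goes through a real bool. The names to_bool, Flags and armed_and_ready are made up for this example.

```cpp
#include <cassert>

// Hypothetical structure with byte-sized flags filled in by the caller.
struct Flags
{
    signed char armed;
    signed char ready;
};

// Typed replacement for the tobool macro: zero is false, the rest is true.
inline bool to_bool(signed char raw)
{
    return raw != 0;
}

// Both flags must be set, whatever value each caller used for "true".
bool armed_and_ready(const Flags& flags)
{
    return to_bool(flags.armed) && to_bool(flags.ready);
}

void flags_demo()
{
    Flags mixed = { 1, -1 };  // one caller used 1, another used -1
    assert(armed_and_ready(mixed));

    Flags half = { -1, 0 };
    assert(!armed_and_ready(half));
}
```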

So, I thought, why not define the members of the structure as real C++
"bool" values? The callers would still be obligated to place byte data into
the structure, but the compiler might be nice enough to create code that
consistently interpreted the bitwise logical operators as operating on
zero/non-zero data rather than as operating on 8-bit numeric data.

- Bob


 
Doug Semler

Bob Altman said:
I guess my original posting could have been a little more complete (at the
risk of being longer and harder to digest). I have a standard DLL that is
called from FORTRAN and Ada clients (as well as C/C++ and VB.Net). Some of
the functions in this DLL accept a structure as an argument. That structure
contains some Boolean (true or false) data.

I initially implemented things in a completely "C-friendly" way, by defining
the structure as containing "char" (a.k.a. "byte") data, with the
specification that the data must be either zero for false or non-zero for
true. FORTRAN callers like to plunk -1 (all bits set) into the structure
for "true". C callers tend to use 1 for "true". Some callers use built-in
conversions or functions to cast their concept of "true" and "false" to a
byte data type, and they are often not consistent from operation to
operation, so I may get a combination of 1 and -1 values for "true".

In the code that accesses the structure, it's tempting to use the bitwise
logical operators (&&, ||, !) on the byte data, but that scheme fails
miserably if TRUE isn't consistently represented. If I want to operate
directly on the char data in the structure, I need to be super careful to
always cast the char data to bool on the fly (I use a macro: #define
tobool(x) (x!=0)). The compiler doesn't help catch places where I neglect
to do this. It just performs a bitwise logical operation and gives me a
numeric result.

Hold on a sec, you aren't using bitwise operators, you are using
logical operators. You should have no problems as long as you have
documented that FALSE is 0 in all cases and true is non zero. You
shouldn't be doing bitwise operations on these values (since they are
being treated as bools in your code). But that's usually a debugging
problem <g>

typedef struct mystruct
{
    char boolOne;
    char boolTwo;
} x;

void function(x* myX)
{
    // OK - doesn't matter - in C's logical operator world, 0 is false,
    // non zero is true.
    if (myX->boolOne || myX->boolTwo)
    {
    }

    // Your (not caller) bug - bitwise not good unless all callers are
    // consistent (well, it's ok for the | but not the &)
    if (myX->boolOne | myX->boolTwo)
    {
    }
}
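
The difference between the two kinds of operators shows up as soon as two callers use different non-zero values for "true". A minimal sketch with illustrative function names:

```cpp
#include <cassert>

// Logical AND: each operand is first converted to bool.
bool both_logical(char a, char b)
{
    return a && b;
}

// Bitwise AND: operates on the bit patterns, then tests the result.
bool both_bitwise(char a, char b)
{
    return (a & b) != 0;
}

void operator_demo()
{
    char one = 1;
    char two = 2;  // also "true", but shares no bits with 1
    assert(both_logical(one, two));   // true && true
    assert(!both_bitwise(one, two));  // 1 & 2 == 0, so this reports false
    assert(both_bitwise(one, -1));    // 1 & -1 == 1, works only by luck
}
```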
 
Bob Altman

Oh!!!! (What's the emoticon for "embarrassed"?) I didn't realize that "!",
"&&" and "||" were logical operators that implicitly convert their arguments
to bool and return a bool result (or so claims the on-line docs). So I'm
fine using char data types in my public structures and interfaces. I just
have to be careful to avoid the bitwise logical operators ("&" and "|"),
which the compiler will happily implement for me without warning.
(Apparently there is no bitwise "not" operator--or at least I can't find it
in the MSDN docs.)

Thanks!!!

- Bob
 
Ben Voigt [C++ MVP]

Bob Altman said:
Oh!!!! (What's the emoticon for "embarrassed"?) I didn't realize that
"!", "&&" and "||" were logical operators that implicitly convert their
arguments to bool and return a bool result (or so claims the on-line
docs). So I'm fine using char data types in my public structures and
interfaces. I just have to be careful to avoid the bitwise logical
operators ("&" and "|"), which the compiler will happily implement for me
without warning. (Apparently there is no bitwise "not" operator--or at
least I can't find it in the MSDN docs.)

bitwise complement is '~'.
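
Like '&' and '|', the '~' operator works on bit patterns and should not be mistaken for '!'. A short sketch, assuming the usual two's complement representation; the function names are hypothetical:

```cpp
#include <cassert>

// Logical not: yields 1 only for zero input.
int logical_not(signed char value)
{
    return !value;
}

// Bitwise complement: flips every bit of the promoted int.
int bitwise_not(signed char value)
{
    return ~value;
}

void complement_demo()
{
    assert(logical_not(1) == 0);
    assert(bitwise_not(1) == -2);  // still non-zero, so still "true"
    assert(bitwise_not(-1) == 0);  // only all-bits-set flips to zero
    assert(logical_not(0) == 1);
}
```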
 
