Identical binaries from same source code

tararot2

Hi,
I have to homologate my binaries, so I need them to be byte-for-byte
identical after each compilation (provided the source code is the same,
of course). Does anybody know how to achieve that, either with compiler
options or by modifying parts of the binaries?
After some research I've successfully patched:
- timestamp: at address 0x88, which I have set to zero.
- GUID after <PrivateImplementationDetails>: find this text in the file
and set the GUID to zero.
But there is an unknown data block after what I suppose to be the
string declarations, and I don't know what to do with it. In some cases
setting it to zeros works, but in others the application hangs. I cannot
find a generic way to clean it.
I need to do this automatically for arbitrary .NET binaries.

Thanks in advance.
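
As an illustration of the timestamp patch described above, here is a minimal Python sketch. It assumes the file is an ordinary PE image and finds the COFF TimeDateStamp field from the header instead of hard-coding 0x88 (0x88 is where that field falls when the PE header starts at 0x80, which is presumably the case for the binaries in this thread). The names `timestamp_offset`, `patch_timestamp` and the constant `FIXED_TIMESTAMP` are hypothetical, and the sketch does not touch the GUID or the unknown data block.

```python
import struct

# Hypothetical fixed value; any constant agreed with the homologation
# company would do, as long as it is reused for every build.
FIXED_TIMESTAMP = 0x4B000000


def timestamp_offset(image: bytes) -> int:
    """Return the file offset of the COFF TimeDateStamp field."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (at 0x3C in the DOS header) is the offset of "PE\0\0".
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Signature (4 bytes), Machine (2) and NumberOfSections (2)
    # come before the 4-byte timestamp.
    return pe_offset + 8


def patch_timestamp(image: bytes, value: int = FIXED_TIMESTAMP) -> bytes:
    """Return a copy of the image with the timestamp overwritten."""
    patched = bytearray(image)
    struct.pack_into("<I", patched, timestamp_offset(image), value)
    return bytes(patched)
```

Reading the offset from the header keeps the patch valid if a different toolchain places the PE header elsewhere.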
 
Cowboy (Gregory A. Beamer)

I do not know everything that might be slightly different. I know that
culture changes can affect things, as can the build numbers.

I am not sure why you need this, however, as you can sign an assembly to
show it has not been tampered with. That may not be your reason, however, so
I may be being a bit presumptuous. :)

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

*************************************************
| Think outside the box!
|
*************************************************
 
Lasse Vågsæther Karlsen

If the source code doesn't change, and you want the same binary as you
originally built, surely it would be easier to just keep a copy of the
original binary so that you don't have to rebuild it?

If not, why this odd requirement? What problem are you hoping to solve
with it?
 
snamds

Hi, I belong to the same development group as tararot2.

Our software must be homologated by an external company because of
legislation in our business sector.
The applications we make run on specific hardware.
The homologation company compiles our code, builds the binaries and
homologates a "CRC" of the result.
At this point we could distribute the binaries they compiled to this
specific hardware, but the problem comes later. If a customer reports
a minor error, we have to be able to fix it, build the binaries, send
the result to our customer and only after that send the code to be
homologated again. The binaries the homologation company builds must
be the same as the ones we sent to our customer.

Now we're trying to achieve our goal using mono-cecil.

Thanks in advance
 
Lasse Vågsæther Karlsen

But the question still stands, after you, or that external company, has
built the code, isn't this binary what you would want to send to a customer?

What I'm saying is that it might be easier for you to just set up
routines to retrieve the binary they build and release this as your own,
instead of trying to circumvent how the linker works.
 
snamds

You are right, but if there is a bug and I can always generate the
same binaries, I can fix it and send the binary to my customer within
a single day. If I fix the error, send it to be homologated and wait
until the homologation company builds the binary, my customer doesn't
get fast support from me. Homologation can be very slow.

It's "just" a question of time. Very important for us and our
customers.
 
Jon Skeet [C# MVP]

It's still not at all obvious to me why you can't send the original
file. Why rebuild something in order to produce something that is
guaranteed to be the same as another copy you've already got?

I've a feeling there's something in this situation which you haven't
explained, otherwise it doesn't make sense.
 
snamds

I've nothing to hide.
If there is a bug in the code I want to build a fixed version and
install it at the customer's site ASAP.
After that I'll send my source code to be homologated. They'll validate
the source and, once it is validated, they'll build it. So if their
generated exe is not identical to the one I sent to my customer, my
customer will not have a homologated version of the software.

When I say an exe is homologated I mean that the "CRC" of the exe is
homologated. A customer CANNOT have an exe whose "CRC" is not
homologated. That's why my generated exe must be identical
byte-for-byte with their generated exe.
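
To make the byte-for-byte requirement concrete, here is a small Python sketch that computes a CRC-32 and lists the byte ranges where two builds differ. It assumes the "CRC" in question is the common CRC-32 as computed by `zlib.crc32`, which the thread does not confirm; `crc32_of` and `differing_ranges` are hypothetical helper names.

```python
import zlib
from typing import List, Tuple


def crc32_of(data: bytes) -> int:
    """CRC-32 of the whole file contents, as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def differing_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ."""
    if len(a) != len(b):
        raise ValueError("builds differ in length; offsets are not comparable")
    ranges = []
    start = None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = offset  # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, offset))  # the run ended at this offset
            start = None
    if start is not None:
        ranges.append((start, len(a)))
    return ranges
```

Running `differing_ranges` on two builds of unchanged source shows exactly which offsets still vary, which is a quick way to see whether a patching step has covered everything.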
 
Jon Skeet [C# MVP]

I've nothing to hide.

Sorry, I didn't mean to imply that you did - merely that we hadn't got
enough of the information to understand the problem.
If there is a bug in the code I want to build a fixed version and
install it at the customer's site ASAP.
After that I'll send my source code to be homologated. They'll validate
the source and, once it is validated, they'll build it. So if their
generated exe is not identical to the one I sent to my customer, my
customer will not have a homologated version of the software.

Right, I see.
When I say an exe is homologated I mean that the "CRC" of the exe is
homologated. A customer CANNOT have an exe whose "CRC" is not
homologated. That's why my generated exe must be identical
byte-for-byte with their generated exe.

Right. In that case, I don't think the question as originally asked
would actually help you. It doesn't help for you to be able to rebuild
the same source to the same binaries multiple times if the homologating
company then uses a different mechanism to compile your code.

Now, three questions arise:

1) If you send the same source to the homologating company twice, do
*they* end up with the same binaries?

2) Are they happy to share their homologation methods with you?

3) If you can produce the same binary as the homologation company,
doesn't that defeat the purpose of the legislation? I would expect the
point to be similar to crypto-signing - e.g. to guarantee that the
homologation company has the original source code.
 
Ben Voigt [C++ MVP]

Lasse said:
But the question still stands, after you, or that external company,
has built the code, isn't this binary what you would want to send to
a customer?

They have to prove that the binary installed on the machine matches the
source code.

Legislation of business sector probably means this is an electronic voting
machine, although there could be similar requirements in other markets.
 
snamds

Right. In that case, I don't think the question as originally asked
would actually help you. It doesn't help for you to be able to rebuild
the same source to the same binaries multiple times if the homologating
company then uses a different mechanism to compile your code.

This is not a problem. We provide the compiler and method of
compilation to the homologation company. They first validate the
method of compilation.
Now, three questions arise:

1) If you send the same source to the homologating company twice, do
*they* end up with the same binaries?
2) Are they happy to share their homologation methods with you?
1 and 2 -> Yes, because of my prior explanation.
3) If you can produce the same binary as the homologation company,
doesn't that defeat the purpose of the legislation? I would expect the
point to be similar to crypto-signing - e.g. to guarantee that the
homologation company has the original source code.

No, because we must explain what we have changed in the binary after
the build, and they validate the method. They don't have to take our
word for it; they can check the method.

I'm sure this is the way we must do things. The problem is that this
is our first project with C#. Before, with C++ we only had to clear
the timestamp of the binary.


Thanks for your time.
 
snamds

Legislation of business sector probably means this is an electronic voting
machine, although there could be similar requirements in other markets.

It's not for a voting system. It's a multiplayer casino-like game
machine.
 
Jon Skeet [C# MVP]

This is not a problem. We provide the compiler and method of
compilation to the homologation company. They first validate the
method of compilation.

In that case, would it be possible to have a cache of source code +
binaries, so that it always consults the cache before bothering to
build at all?

The best way of making sure that you get the same result is not to
rebuild unnecessarily :)

I do understand that that may not be feasible, however.
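
The cache idea could be sketched roughly as follows in Python. Everything here is hypothetical: `source_digest`, `BuildCache` and the `build` callback stand in for whatever the real build process looks like, and the cache is kept in memory only to keep the example short.

```python
import hashlib
from typing import Callable, Dict

Sources = Dict[str, bytes]  # file name -> file contents


def source_digest(sources: Sources) -> str:
    """Hash the full source tree in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(sources):
        encoded = name.encode("utf-8")
        data = sources[name]
        # Length prefixes keep name/content boundaries unambiguous.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


class BuildCache:
    """Only invoke the compiler for source trees not seen before."""

    def __init__(self, build: Callable[[Sources], bytes]) -> None:
        self._build = build
        self._binaries: Dict[str, bytes] = {}

    def binary_for(self, sources: Sources) -> bytes:
        key = source_digest(sources)
        if key not in self._binaries:
            self._binaries[key] = self._build(sources)
        return self._binaries[key]
```

With such a cache, identical source always yields the stored binary, even if the underlying compiler output is not reproducible.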
1 and 2 -> Yes, because of my prior explanation.


No, because we must explain what we have changed in the binary after
the build, and they validate the method. They don't have to take our
word for it; they can check the method.

I'm sure this is the way we must do things. The problem is that this
is our first project with C#. Before, with C++ we only had to clear
the timestamp of the binary.

The CLI spec gives details of the PE format, but it's possible that
there are some implementation-specific fields that are being generated
every time.

By the way, I wouldn't set the GUIDs and times to zero - I'd choose
appropriate values and always reuse those.

Note that if you sign your assemblies, all of this editing will
probably (hopefully, even) invalidate the signature - in which case the
suggestion at the top of this post is probably the only practical one.


Would it be possible to make the CRC comparison ignore specific parts
of the binaries, so long as you could explain that they don't affect
the behaviour of the code? That way we wouldn't need to work out what
values to use for the new binaries, just reasons why they're not
important.
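
A CRC comparison that ignores agreed regions might look like the following Python sketch. It assumes CRC-32 via `zlib.crc32` and that the ignored regions are given as half-open byte ranges; `masked_crc32` is a hypothetical name, and which ranges are safe to ignore would still have to be justified separately.

```python
import zlib
from typing import Iterable, Tuple


def masked_crc32(data: bytes,
                 ignored: Iterable[Tuple[int, int]],
                 fill: int = 0) -> int:
    """CRC-32 of data with each ignored (start, end) range overwritten."""
    buffer = bytearray(data)
    for start, end in ignored:
        if not 0 <= start <= end <= len(buffer):
            raise ValueError(f"range ({start}, {end}) is outside the file")
        # Replace the volatile bytes with a constant before hashing.
        buffer[start:end] = bytes([fill]) * (end - start)
    return zlib.crc32(bytes(buffer)) & 0xFFFFFFFF
```

Two builds that differ only inside the ignored ranges then produce the same value, while any difference elsewhere still changes the input to the CRC.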
 
snamds

The CLI spec gives details of the PE format, but it's possible that
there are some implementation-specific fields that are being generated
every time.

We want this to be our last alternative.
By the way, I wouldn't set the GUIDs and times to zero - I'd choose
appropriate values and always reuse those.

You are right. We are setting zeroes in our tests. The idea is to
reuse good values as you say.
Note that if you sign your assemblies, all of this editing will
probably (hopefully, even) invalidate the signature - in which case the
suggestion at the top of this post is probably the only practical one.

We don't need to sign the assemblies.
Would it be possible to make the CRC comparison ignore specific parts
of the binaries, so long as you could explain that they don't affect
the behaviour of the code? That way we wouldn't need to work out what
values to use for the new binaries, just reasons why they're not
important.

This is not a solution, they must CRC it easily. We can't control this
process.

We are getting good results using mono-cecil. It's almost done.

Thank you all very much.
 
Jon Skeet [C# MVP]

We want this to be our last alternative.

Consulting the spec as the last alternative? To be honest, it'll be a
lot more authoritative than other answers here.

Another alternative might be to look at the Mono compiler source code -
that will hopefully make it clear what belongs where.
You are right. We are setting zeroes in our tests. The idea is to
reuse good values as you say.
Goodo.


We don't need to sign the assemblies.

That definitely makes it simpler.
This is not a solution, they must CRC it easily. We can't control this
process.

We are getting good results using mono-cecil. It's almost done.

Okay, glad to hear it's going well.
 
snamds

Consulting the spec as the last alternative? To be honest, it'll be a
lot more authoritative than other answers here.

I meant we prefer to use well-tested third-party libraries rather than
parse the binaries manually ourselves.
Another alternative might be to look at the Mono compiler source code -
that will hopefully make it clear what belongs where.

We have made changes to the mono-cecil source code to achieve our
goal.
That definitely makes it simpler.

I don't get the point. How would signing the assemblies help us?
We can't change the homologation rules. It must be done by CRC, as I've
explained.
 
Jon Skeet [C# MVP]

I meant we prefer to use well-tested third-party libraries rather than
parse the binaries manually ourselves.

Right - yes, that's a laudable tactic.
We have made changes to the mono-cecil source code to achieve our
goal.
Right.


I don't get the point. How would signing the assemblies help us?
We can't change the homologation rules. It must be done by CRC, as I've
explained.

No, the fact that the assemblies *aren't* signed makes it simpler.
 
Ben Voigt [C++ MVP]

I don't get the point. How would signing the assemblies help us?
We can't change the homologation rules. It must be done by CRC, as I've
explained.

CRC is not a good hash. If you predict the file the homologation procedure
creates you can construct one with arbitrary behavior and the same CRC.
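
One way to see why a CRC is weak against deliberate tampering is its algebraic structure. The Python sketch below, which assumes the CRC is the CRC-32 computed by `zlib.crc32`, shows that for three messages of equal length the CRC of their XOR equals the XOR of their CRCs. It is not a forgery tool, only a demonstration of the regularity that makes forging a matching CRC feasible; `xor_bytes` and `crc_of_xor_matches` are hypothetical names. A cryptographic hash such as SHA-256 (`hashlib.sha256`) is designed not to have this kind of structure.

```python
import hashlib
import zlib


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR together byte strings of equal length."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for index, byte in enumerate(block):
            out[index] ^= byte
    return bytes(out)


def crc_of_xor_matches(a: bytes, b: bytes, c: bytes) -> bool:
    """Check crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c)."""
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


def sha256_of(data: bytes) -> str:
    """A cryptographic digest that could be published alongside the CRC."""
    return hashlib.sha256(data).hexdigest()
```

For example, `crc_of_xor_matches(b"approved build 1", b"approved build 2", b"tampered build 3")` holds, as it does for any three inputs of the same length.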
 
