Source code obfuscation?

  • Thread starter: kiln

kiln

I know MDEs scramble code, but are there any products or projects out there
that obfuscate VBA code? I see that FMS has one; does anyone have
experience with it?
 
The best way to obfuscate your code is to compile to MDE, I suspect. I have
not heard of anyone writing an MDE decoder, though I suppose it would be
possible to recreate some code from an MDE -- however, without the
meaningful statement labels, comments, and variable names, it would be
difficult to follow and use elsewhere.

Larry Linson
Microsoft Access MVP
 
Hi Larry

Apparently they can be decoded, sans proper function names etc., as you
guessed. MDE does obfuscate, but I'm hoping someone knows whether the MDE
format obviates the utility of a formal obfuscation operation, or whether
an obfuscator does add to the level of difficulty.
 
kiln said:
Apparently they can be decoded, sans proper function names etc., as you
guessed. MDE does obfuscate, but I'm hoping someone knows whether the MDE
format obviates the utility of a formal obfuscation operation, or whether
an obfuscator does add to the level of difficulty.

... you could, of course, just port the whole thing to C++ or even to C if
you really want to make it hard to reverse engineer.


Tim F
 
kiln said:
Apparently they can be decoded, sans
proper function names etc., as you
guessed. MDE does obfuscate, but
I'm hoping someone knows whether the MDE
format obviates the utility of a formal
obfuscation operation, or whether an
obfuscator does add to the level
of difficulty.

Do you have a link to anything that does more than speculate about this? I,
truly, had not heard of it actually being done, and thought no one figured
it was worthwhile to invest the time and effort to crack the MDE.

Obfuscators work in a couple of ways: they replace meaningful labels and
variable names with random strings, and, sometimes, they rearrange the code
so anyone trying to decode it has to deal with a lot of GoTo. Additionally,
some add meaningless instructions that can never be reached during execution.
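To make the first technique concrete, here is a minimal sketch in Python (not VBA) of the renaming step. It assumes you already have a list of the identifiers to hide. The function name `obfuscate_names` and the sample VBA routine are made up for illustration, and this is not how FMS or any other product actually does it. A real tool would also have to skip string literals and comments and respect VBA's case-insensitivity, which this toy ignores.

```python
import re

def obfuscate_names(source, names):
    """Replace each identifier in `names` with a meaningless one (v0, v1, ...).

    Hypothetical helper for illustration only: it does plain whole-word
    text substitution and knows nothing about VBA syntax.
    """
    mapping = {name: f"v{i}" for i, name in enumerate(names)}
    # \b ensures we only match whole identifiers, not substrings of longer ones.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping

# A made-up VBA routine, held as text.
vba = (
    "Function NetPrice(Gross, TaxRate)\n"
    "    NetPrice = Gross / (1 + TaxRate)\n"
    "End Function\n"
)

scrambled, mapping = obfuscate_names(vba, ["NetPrice", "Gross", "TaxRate"])
print(scrambled)
```

The logic of the routine is untouched; only the names that told a reader what it was for are gone.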

The first of these is handled by the MDE... the meaningful variable names,
labels, and comments are just no longer there. The other two only delay
someone; they do not make the code difficult to understand -- my guess
is that most any of us could write code to "straighten" VBA code that has
been shuffled, without much effort.
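As a rough illustration of why shuffling alone is weak, the sketch below (Python, not VBA) models shuffled code as labelled blocks that each end in an unconditional jump, then walks the jumps from the entry label to recover the original order. The data layout, the labels, and the name `straighten` are all assumptions made for this example. Real shuffled code with conditional branches and loops would need proper control-flow analysis.

```python
def straighten(blocks, entry):
    """Return statements in execution order by following the jump chain.

    `blocks` maps a label to (statements, next_label); next_label is None
    for the final block. Hypothetical toy model: unconditional jumps only.
    """
    ordered, seen, label = [], set(), entry
    while label is not None:
        if label in seen:  # already visited: a loop, so stop instead of spinning
            break
        seen.add(label)
        statements, next_label = blocks[label]
        ordered.extend(statements)
        label = next_label
    return ordered

# Blocks stored out of order, each one "GoTo"-ing the next.
shuffled = {
    "L3": (["Total = Total * Rate"], "L2"),
    "L1": (["Total = Price * Qty"], "L3"),
    "L2": (["Result = Total"], None),
}

print(straighten(shuffled, "L1"))
```

Starting from `L1`, the walk visits `L1`, `L3`, `L2`, which is the order the statements would actually run in.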

FMS, Inc. used to have an obfuscator function in their Code Tools product,
before MDEs were introduced. I don't know if Code Tools still includes one
for people who have a need to distribute MDBs -- you could check it out at
http://www.fmsinc.com.

What are you trying to accomplish? Is your code so valuable that someone
would go to great lengths to steal it? Or are you trying to protect your
code in order to protect your data?

If the first, carefully consider that many experienced Access developers can
easily re-create most applications just from observing the forms and
watching the application run. Frankly, it is the business application that
you'd want to protect, and that is, generally, obvious from seeing the
application in action.

If the second, no one really needs to get your code to get your data. If you
let someone have unlimited access to your data, it is just about as close to
certain as can be, that in time they will be able to crack any protection
scheme. The best way to protect data is to house it on your own server, in a
server database that has "robust" protection.

Larry Linson
Microsoft Access MVP
 
... you could, of course, just port the whole thing to C++ or even to C if
you really want to make it hard to reverse engineer.


Tim F

Right, maybe I'll take what's left of the weekend and do that!

As an aside, I've read that anything running client side is ultimately
unencryptable. But there are probably degrees of difficulty. Do you
happen to know if C++ is a lot more difficult to decrypt than Java?
 
Hi Larry

Actually I'd rather not provide any links, as I'm not all that much in
favor of anyone decrypting mde files! I'm not wise on this topic but
others who seem to be gave me the impression that I related. Previous to
about six months ago I also thought that no one had cracked mde
encryption, but I think that's wrong. Aside from any issues with how
well Microsoft implemented mde encryption, there are informed opinions
out there to the effect that anything running on a client pc is
vulnerable to being decrypted. An Access mde would have to be simpler
than something written in Java or C++. But again I'm not an expert on
this stuff. I'm just relating my latest impressions from what I regard
as reliable sources.

The interest is there from a couple of angles. I have clients that
redistribute apps I've built for them, and they want their investment in
the software to be as protected as it can be, within reason. They
understand that Access isn't the top dog in the security arena, but they
want what security they can get. Most clients aren't going to be
interested in putting much, if any, effort into reverse engineering
these apps. Most could be reconstructed by just seeing how the
application works...none are rocket science, some are clever. One
project I'm involved in that is not Access-based *is* rocket science, it
has original math etc.

And yes, I'm aware that protecting back-end data is mostly a different
issue. I do appreciate your writing up your thoughts, covering the
bases. MVPs like you deserve the thanks of the rest of us folks; it
makes a big difference.
 
"kiln" wrote

Actually I'd rather not provide any links,
as I'm not all that much in favor of anyone
decrypting mde files!

AFAIK, .mde files are not encrypted, they are "compiled" to "tokenized"
form, in a similar but not identical manner to classic VB executables, up
through V4. There was a "decompiler" written for VB, but I don't think the
authors ever earned enough to compensate them for the time they put into
writing it.

Access' "encrypting" doesn't do much except prevent someone from poking
around with a disk editor and seeing the data. You don't supply a key; the
ability to decrypt is built into Access itself.

. . ., there are informed opinions out there
to the effect that anything running on a client
pc is vulnerable to being decrypted.

I think I said essentially the same thing.

An Access mde would have to be simpler
than something written in Java or C++.

I don't think there would be that much difference, except the translation
from MDB to MDE is not documented. I believe the translation from Java
source to bytecode is documented -- the result is, I believe, interpreted
rather than executed directly, just like MDE.

The output from C++ is executable machine language, which is well documented.
There are disassemblers which will create assembly source from executable
machine language. Recreating C++ source would be more difficult, and I don't
keep up with that area, so I don't know if there are "decompilers" for that
purpose.

The interest is there from a couple of
angles. I have clients that redistribute
apps I've built for them, and they want
their investment in the software to be
as protected as it can be, within reason.

FMS' Code Tools writeup on their website indicates it does include "code
obfuscation". If you combine that with compiling to MDE, that's probably
about as good as you are going to get. If you wanted, also, to add Access'
encryption, that would further complicate the life of anyone who just wanted
to take the raw file and hammer it with some program. To get at the files
unencrypted, they would have to get them from within the Access application.

Any complicated calculations could be done in some other language, packaged
as a DLL, and the functions called from Access. I am sure there are schemes
for those other languages to try to make it difficult to regenerate the
source, too.

I do appreciate your writing up your
thoughts, covering the bases. MVPs
like you deserve the thanks of the
rest of us folks; it makes a big
difference.

Thanks for the kind words. It is making a difference for others that makes
it worthwhile for us. You are most welcome.

Larry Linson
Microsoft Access MVP
 
As I seem to recall from taking a Java class six years ago, Java is a
language requiring the app to be compiled. To run the app, the compiled
code is interpreted by the Java Virtual Machine on the PC running the
Java app. The JVM is the device which makes Java universal and is built
specifically for each platform. Hence there would be a JVM for Windows, one
for Macs, one for Unix etc.
 
Wouldn't it just be easier to build an app in VB and then compile it?

There was an invisible sarcasm smiley up there... I strongly suspect that
compiled VB uses a symbol table that would make reverse engineering easy,
while with lower level languages one would have to chip away at the
actual machine instructions to follow what was going on. Hence the
reference to C itself. Even writing in assembler can be reversed.

kiln said:
Right, maybe I'll take what's left of the weekend and do that!

Interesting language, C. You can learn it over a weekend, and produce
something useful by the end of, say, eight or nine years...

In the end, any executable has to be readable by the processor, so there
is no such thing as encryption.

I don't really see the point, anyway. By instinct I am with the "free as
in speech" attitude to software development. If there is a really good
solution to a programming problem, then it will have been published
already; if I were to produce a genuinely new solution, I'd be only
flattered if lots of other people used it. My guess is that people who go
to great lengths to hide their brilliance have bugger all to hide.

Just a thought


Tim F
 
As I seem to recall from taking a Java
class six years ago, Java is a
language requiring the app to be compiled.
To run the app, the compiled code is
interpreted by the Java Virtual Machine
on the PC running the Java app. The
JVM is the device which makes Java
universal and is built specifically for
each platform. Hence there would be a JVM
for Windows, one for Macs, one for
Unix etc.

Yep, never trust any use of the word "compiled", these days, to mean what
you think. An Access MDB is "compiled" to MDE, which means "tokenized" with
comments removed, and meaningful labels and variable names replaced. With
Java, it appears to mean "translated to something that the JVM can interpret
on any machine that has a JVM", which probably is very similar to being
"tokenized".
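Python happens to follow the same "translate, then interpret" model, so it makes a convenient way to see what such a translated form can still contain. The sketch below compiles a small made-up function (`net_price` is an invented name) and inspects the resulting code object with the standard `dis` module. It is only an analogy: it shows what Python's bytecode keeps, and says nothing about what an MDE or a Java class file keeps.

```python
import dis

# Made-up source text for illustration.
source = "def net_price(gross, tax_rate):\n    return gross / (1 + tax_rate)\n"

# "Compile" here means: translate to bytecode for the Python interpreter.
module_code = compile(source, "<example>", "exec")

# The function body is stored as a nested code object among the constants.
func_code = next(c for c in module_code.co_consts if hasattr(c, "co_varnames"))

print(func_code.co_name)      # the function's name survives translation
print(func_code.co_varnames)  # so do the parameter names
dis.dis(func_code)            # readable listing of the interpreted instructions
```

In this particular format the names are still there after translation, which is why a separate renaming pass can matter for formats that keep them.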

In the dim, dark, distant past in the early days of Problem Oriented
Languages, like COBOL, PL/1, and FORTRAN, "compiled" meant translated from
human readable into executable machine code (just as assemblers translated a
much lower-level language into executable machine code).

The Dot Net languages take yet another tack... they are compiled to an
intermediate language, and a Just-In-Time compilation step converts that
into machine language just prior to executing it, so the machine code
isn't normally stored, just regenerated when needed. Nah, of course it isn't
as efficient, but they seem to believe that they have solved a lot of
problems with that approach.

Larry Linson
Microsoft Access MVP
 