Byte code interpretation by hand


R.Wieser

Hello All,

I've found a list of CIL byte-code commands on Wikipedia
( http://en.wikipedia.org/wiki/List_of_CIL_instructions ), but it's a bit
meager. For instance, I've got some code starting with this:

F0 29 00 00 00 00 00 00 48 00 00 00 02 00 05

Alas, the F0 isn't in the aforementioned list, so I have no idea how to
proceed, and could use a hand.

Also, I've tried to find/download "ildasm.exe", but could not find it
anywhere (other than inside a .NET SDK). Any place where I could download
just it (and maybe "ilasm.exe" too) ?

Regards,
Rudy Wieser
 
Hello Peter,
JetBrains makes a good .NET disassembler. You can get it here: <snip>

I've found that one and a few others like it. But I do not really need
decompiled code (a simple disassembly will do fine), and it needs to stay
portable (no installing).
The .NET CLR specification is the place to look for details
about MSIL.

Could you please be a bit more specific about where to find it ? A quick
google does return a number of hits, but the first one mentioning the spec
is someone also looking for it ...
But if you have a real need for ildasm.exe, I don't see the
problem in downloading an Express version of VS or similar
supported tools that would come with it.

My apologies, but I do see a problem. I would be wasting bandwidth and I
would have trouble with cleanly getting rid of such an SDK again. In short:
It's something I will only do as a last resort.

Apart from the problem that at least the VC++ out of the VS 2008 Express
edition package does not contain ildasm.exe.

--Newsflash--
Scratch the above "does not contain" statement. I searched from "Local Disk
(C:)" (to be sure I would not miss anything) and it did not turn up
anything. Searching "C:\program files" however did. Don't you love Windows
? :-(

Regards,
Rudy Wieser

P.s.
Yes, I installed VC++ 2008. I had some use for it. :-)


-- Original message
 
Hello peter (others?),

I've used ildasm to generate a dump of a (very small) .net dll, and
encountered a problem: Although the listing shows all the byte-code it seems
to skip some bytes at the top of each method (the method-start is pointing
to this byte). For instance, this method:

02 - ldarg.0
28 10 00 00 0A - call (0A)000010
2A - ret

actually seems to start with the byte 1E which does not show in the ildasm
dump. I do however see a ".maxstack 8" line as well as a remark mentioning
the code-size of that method (7 bytes), which are possibly taken from it (but
how?).
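
A minimal sketch of how that leading byte could be read, assuming the "tiny" method header format described in ECMA-335 Partition II: the low two bits hold the format tag (binary 10 for tiny), the upper six bits hold the code size, and the maximum stack depth is not stored but taken to be 8. The function name and the dictionary keys are made up for illustration.

```python
def decode_tiny_header(first_byte: int) -> dict:
    """Decode a one-byte 'tiny' CIL method header (illustrative helper)."""
    if first_byte & 0x03 != 0x02:          # low 2 bits: format tag, 0x2 = tiny
        raise ValueError("not a tiny method header")
    return {
        "code_size": first_byte >> 2,      # upper 6 bits: size of the IL body
        "max_stack": 8,                    # implied for tiny headers, not stored
    }


# Header byte 1E followed by the three instructions from the dump above.
method = bytes([0x1E, 0x02, 0x28, 0x10, 0x00, 0x00, 0x0A, 0x2A])

header = decode_tiny_header(method[0])
body = method[1:1 + header["code_size"]]

# The operand of 'call' (opcode 28) is a 4-byte little-endian metadata token:
# the top byte selects the metadata table, the low 3 bytes are the row number.
token = int.from_bytes(body[2:6], "little")
table, row = token >> 24, token & 0xFFFFFF

print(header, body.hex(), hex(table), hex(row))
```

Read this way, 1E accounts for both the 7-byte code size and the ".maxstack 8" line, and the call operand splits into table 0A and row 10 (hex), matching the "(0A)000010" notation in the dump.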

Another method starts with even more "hidden" bytes:

13 30 03 00 ED 00 00 00 01 00 00 11

That method seems to have its code-size mentioned at offset 4 (0xED), and
possibly its .maxstack value at offset 2 (0x03). The ildasm dump also
mentions 6 local vars ...
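
These twelve bytes are consistent with the "fat" method header format in ECMA-335 Partition II. A sketch under that assumption follows: a 16-bit word holding 12 bits of flags plus a 4-bit header size (in 4-byte units), then a 16-bit max stack, a 32-bit code size, and a 32-bit token for the local variable signature. The function name and dictionary keys are illustrative only.

```python
import struct


def decode_fat_header(data: bytes) -> dict:
    """Decode a 12-byte 'fat' CIL method header (illustrative helper)."""
    # Little-endian: u16 flags+size, u16 max stack, u32 code size, u32 token.
    flags_and_size, max_stack, code_size, local_sig = struct.unpack_from("<HHII", data)
    flags = flags_and_size & 0x0FFF
    if flags & 0x03 != 0x03:                          # low 2 bits: 0x3 = fat format
        raise ValueError("not a fat method header")
    return {
        "header_bytes": (flags_and_size >> 12) * 4,   # size is stored in dwords
        "more_sections": bool(flags & 0x08),          # extra data sections follow the code
        "init_locals": bool(flags & 0x10),            # locals are zero-initialised
        "max_stack": max_stack,
        "code_size": code_size,
        "local_var_sig_token": local_sig,
    }


fat = decode_fat_header(bytes.fromhex("13 30 03 00 ED 00 00 00 01 00 00 11"))
print(fat)
```

Under this reading the code size is indeed 0xED (237) and the max stack is 3. The last four bytes form the token 0x11000001, row 1 of table 0x11 (the standalone signature table), which would be where ildasm gets the list of local variables.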

And there is the entry-point of the .NET code (with the label _CorDllMain),
which starts with the earlier mentioned

F0 29 00 00 00 00 00 00

After this the "CLR header" seems to start (0x00000048 - size of the
header).
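
A rough sketch of how those bytes could be split, assuming the CLR header really does start at offset 8 and begins with a 32-bit size followed by two 16-bit runtime version numbers. The dump in the first post stops after 05, so the high byte of the minor version is padded with 00 here, which is an assumption. What the first eight bytes mean is a guess as well: they read as the two dwords 0x000029F0 and 0, which would fit a one-entry import address table for _CorDllMain followed by a terminator, but nothing in the dump confirms that.

```python
import struct

# The fifteen bytes quoted at the start of the thread.
blob = bytes.fromhex("F0 29 00 00 00 00 00 00 48 00 00 00 02 00 05")
blob += b"\x00"  # ASSUMPTION: missing high byte of the minor version

# First eight bytes as two little-endian dwords (meaning is unconfirmed).
first_dword, second_dword = struct.unpack_from("<II", blob, 0)

# Start of the CLR header: u32 size, u16 major version, u16 minor version.
cb, major, minor = struct.unpack_from("<IHH", blob, 8)

print(hex(first_dword), second_dword, cb, major, minor)
```

With that padding, the header size comes out as 0x48 (72 bytes) and the runtime version fields as 2 and 5.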


So, one problem solved -- interpreting the byte-code itself -- but another
popping up: no information/idea how to interpret the "wrappers" the
byte-code is packaged in ...

Do you (or anyone else!) know of a (preferably complete) document
describing these headers ?

Regards,
Rudy Wieser


-- Original message
 
Hello Peter,
Fact is, whatever it is you're trying to do, it's highly unusual
for someone to need to know these kinds of specifics about
the byte code.

Not when you are looking at the .NET program-blob at byte-level. As for
that matter, why would anyone have a need for a program like ildasm ? Isn't
knowing that your compiler emits the right byte-code enough ? :-)
But you should expect to have the CLR specification right by your
side in doing so.

Preferably, yes. That is why I first did google for at least an hour to
try to find some specs. I even tried Yahoo, as it is more exact in its
matches. The Wikipedia link I posted was the most useful that I could find
in that time.

Only after I did that I decided it would perhaps be a good idea to ask the
question in a .NET related newsgroup.
I know Microsoft's not actually hiding it from view, so it's simply
a matter of entering the right search terms in your favorite web
search engine and sifting through the results.

Yes, I can again throw the words "dotnet" "CLR" "specification" and other
possibly related words in combinations at Google and Yahoo, but I do not
expect more results from it than I got in the last few days (I did not stop
searching for more info after I posted here).

Microsoft's pages relating to specific .NET commands also did not contain any
reference/link to a more thoroughly explaining document (yes, I tried that
too).

That was why I asked if you or anybody else had information or (later) a
link handy.

Regards,
Rudy Wieser


-- Original message:
Peter Duniho said:
[...]
Do you (or anyone else!) know of a (preferably complete) document
describing these headers ?

As I mentioned, the CLR specification has this information.

I don't have a URL off the top of my head, but I know that I was able to
find it the last time I went looking for it. So you should be able to also.

Fact is, whatever it is you're trying to do, it's highly unusual for
someone to need to know these kinds of specifics about the byte code.
Compilers, run-times, and disassemblers are typically the kinds of tools
which need this kind of thing, and we have those already.

If you're trying to re-implement something like that, more power to you.
But you should expect to have the CLR specification right by your side in
doing so. At this point, your first priority would be to try harder
tracking down the specification.

I know Microsoft's not actually hiding it from view, so it's simply a
matter of entering the right search terms in your favorite web search
engine and sifting through the results. Alternatively, go straight to the
source: contact Microsoft, an employee with a blog, or even an ex-employee
with a blog, directly and see if there's someone there that can point you
in the right direction.

Pete
 
Hello Peter,
"Looking at the .NET program-blob at byte-level"
_is_ highly unusual. QED.

QED ? Just because you say so ? Weak argument dude.

Though I have to grant you that statistically there are quite a few people
who will just use a tool without being bothered in the slightest about
what makes it tick. Just assume that I'm one of those people outside the
center of the bell-curve: I'm not a run-of-the-mill user.
... as it is to look at things like the assembly manifest, flags,
references, and member declarations.

Funny, that's what I tried to do too. Does it make any difference how you do
that ? Oh wait, according to you you should only use already made tools.
:-\

Just a question: why are you in this newsgroup ? You sure as heck can't be
doing any programming, 'cause that would violate your own rule of "just use
what's already there". [looking sideways up to the ceiling and whistling]
In any case, a source-level decompiler like DotPeek is IMHO a
much easier and practical approach to those tasks.

You assume too much. I have no use for such re-generated .NET code.
Still, you have not yet shared _why_ it is you feel you need
to look at your program.

Why do you think that is important in regard to what I'm after ? As far as
I know the _why_ of it doesn't change anything to the byte-code descriptions
or the .NET file-format specification itself.

But to humor you: I just want to see how it works on that level. That's all.
Knowledge.
you may find you get a solution to what you're actually doing
if you explain the higher-level problem that led you to trying
to deal directly in IL byte-code.

I'm not sure what you mean with that "higher-level problem" ... Meant as in
a .NET language programming-problem ? Or meant in a more general way ?

In the first case: no such problem exists. I do not even have .NET
source-code or compiler here.
In the second case: As above, I just want to get a thorough understanding
of it.

To be quite honest, that need for knowledge is fed by seeing .NET
executables and DLLs on my machine that currently are fully opaque to me.
If I had it, I would share it.
Thanks.

Sorry I wasn't able to be more helpful.

And mine if I did not match your idea of what someone posting here should be
like.

Regards,
Rudy Wieser


-- Original message:
Peter Duniho said:
Hello Peter,


Not when you are looking at the .NET program-blob at byte-level.

"Looking at the .NET program-blob at byte-level" _is_ highly unusual. QED.
As for
that matter, why would anyone have a need for a program like ildasm ? Isn't
knowing that your compiler emits the right byte-code enough ? :-)

It is less "highly unusual" for someone to use ildasm. Yet, that is unusual
in its own right. And most of the time someone is using ildasm, it's not so
much to interpret the program instructions as it is to look at things like
the assembly manifest, flags, references, and member declarations.

In any case, a source-level decompiler like DotPeek is IMHO a much easier
and practical approach to those tasks.

Still, you have not yet shared _why_ it is you feel you need to look at
your program in the context of the raw byte-code and interpret that
byte-code in a way that isn't made possible through ildasm.exe or a tool
like dotPeek.

I may or may not be the last person alive who still likes to use
newsgroups, or this one in particular. But whoever is your audience, you
may find you get a solution to what you're actually doing if you explain
the higher-level problem that led you to trying to deal directly in IL
byte-code.
[...]
That was why I asked if you or anybody else had information or (later) a
link handy.

If I had it, I would share it. But at this point, long after the last time
I had a need to look at the specification, it would take me the same effort
to find it as it would take you.

Sorry I wasn't able to be more helpful.

Pete
 
Hello Peter,
... the highly unusual nature of needing to ...

Please, don't repeat yourself. "highly unusual" ? Because you say so ?

Yes, I can believe that most .NET programmers (including you?) do not have
any urge to know how it all works "under the hood". It could be enough for
them to know "it just works".

But have you considered the possibility that there are other groups and/or
individuals besides .NET programmers ? Either of which do not accept "it
just works" for too long, or who have a profession (or hobby) in looking
below the immediate surface ?
If you think that it's not highly unusual to want to do that, you
might consider presenting your question to the apparent multitudes
of people who are doing the same thing as you.

Ah, you mean that as there are possibly other people like me they must
automatically form a group ? For what reason ? It's highly unlikely
(statistics) that many of us would be busy with the same problem at the same
time (give or take a year), so such a newsgroup would seemingly consist
of random questions, in a very broad spectrum.

In the light of the above, what's the chance that I would find someone who
is-or-was busy with this same question in such a (non-existing) newsgroup
compared against the chance that someone actually using the language built
on top of it knows a bit more about it than what's at the direct surface ?

Heck, you mentioned yourself that you saw the specifications to the language
yourself. You just can't remember where. That means that I was *very*
close to getting my answer here, don't you think ?
They seem not to be reading this newsgroup, but they
must be somewhere.

Now you mention it, how is your Yeti-hunt going ? I mean they obviously are
not here, so they *must*, according to your logic, have grouped and be
living somewhere ... :-P
Where you hope to find these multitudes,

Huh ? Where did you get the notion that I was looking for those people ?
Personally I think I made quite clear what I was after, so I really have no
clue to how you came to that conclusion.
Because, all too often someone poses a question asking how
to do X, when what they really want to know is how to do Y.

Yeah, that happens. But in my case what I asked was what I wanted to know.
Did I mention I was out for knowledge ?

To make that more explicit, I was-and-am not out for a solution to a
specific problem, I was-and-am out to understand the CIL, and a bit later on
(when I realized there was more to it than only the byte-code) the
"container" the code and its methods were placed in (and no, I do not mean
the PE one. That one I tackled a while ago).
They only _think_ they need to do X because they don't know
enough about Y to know there are better alternatives to
accomplishing it than by doing X

I hope you realize it's very dangerous to second-guess the intentions of
people and then blindly head in the direction you think they want
to go, without even asking them if you're correct in your assumptions ?

And pardon me, but I have now mentioned a few times that you should not
assume so much, so I think that the answer to the above is "no". Oh well,
you're still alive so there is a chance you will, one day, learn it. :-)

Thank you for your time. But I think I'm going to "log off" from this
newsgroup. It currently has nothing I have any use for.

Regards,
Rudy Wieser


-- Original message:
Peter Duniho said:
Hello Peter,


QED ? Just because you say so ? Weak argument dude.

The only "exception" you stated to the highly unusual nature of needing to
know the specifics was "looking at the .NET program-block at byte-level".
Since that is in and of itself highly unusual, it's not a real exception.
It simply reinforces what I already said.

If you think that it's not highly unusual to want to do that, you might
consider presenting your question to the apparent multitudes of people who
are doing the same thing as you. They seem not to be reading this
newsgroup, but they must be somewhere.

I admit, I am not among those multitudes. I practically never need to look
at the byte code. I look at .NET IL byte code even less than I've looked
at the processor machine code over these past few decades or so, and since the
90's that has not been that often either. (I will grant that prior to that,
it was more common...but then our tools got better).

Where you hope to find these multitudes, I have no idea. I'm clearly out of
the loop, having overlooked the vast community of .NET programmers who
regularly inspect the IL byte-code as part of their daily work.
[...]
Still, you have not yet shared _why_ it is you feel you need
to look at your program.

Why do you think that is important in regard to what I'm after ?

Because, all too often someone poses a question asking how to do X, when
what they really want to know is how to do Y. They only _think_ they need
to do X because they don't know enough about Y to know there are better
alternatives to accomplishing it than by doing X.
[...]
But to humor you: I just want to see how it works on that level. That's all.
Knowledge.

See? Was that all that hard?

Now, I note that many people successfully gain knowledge through a process
known as "reverse-engineering". It is not as efficient as reading
documentation, I grant you. But it does work, and would in this case as
well.
I'm not sure what you mean with that "higher-level problem" ... Meant as in
a .NET language programming-problem ? Or meant in a more general way ?

I mean you have asked to do X. I was asking you to explain the Y that led
you to X.

In your case, it seems you simply want to do X for the sake of doing X,
which is fine. But it's useful for you to clarify that. Without stating
that explicitly, there's always the possibility that what you really need
is a different solution to Y than X.
In the first case: no such problem exists. I do not even have .NET
source-code or compiler here.
In the second case: As above, I just want to get a thorough understanding
of it.

To be quite honest, that need for knowledge is fed by seeing .NET
executables and DLLs on my machine that currently are fully opaque to
me.

If they are opaque, it is only because you apparently don't want to use
tools that would render them transparent. Yes, one way to decipher a .NET
program is to learn the IL byte-code yourself. But a disassembler or
decompiler (like ildasm, dotPeek, etc.) essentially encodes precisely the
information you're asking for, simply in a different format.

This is not unlike your video card presenting images to you in a format
that is easier to understand than trying to interpret the millions of byte
values that are used to actually describe each frame of video. Or for that
matter, retrieving each individual byte of the .NET program you're
interested in one at a time by directly querying the disk or other storage
in which it resides.

(I'm assuming here that you have used some kind of binary editor/display
program to look at the individual bytes, and are trusting that its
abstraction of the actual raw data stored in your computer system is an
accurate representation. My apologies if in this case also you have gone
"straight to the source", so to speak).

Of course, every person will decide for themselves what level of
abstraction they are comfortable with in their computer systems. There's
nothing wrong with wanting to know more about the fine details of what's
inside a .NET program.

But it's fallacious to assume that the _only_ way to achieve this
understanding is through direct inspection of the individual bytes of the
program.

In any case, the bottom line here remains the same: if you want perfect
knowledge of the .NET IL byte code, you need the specification.

It's not being hidden from you. I know it exists, and I know you can find
it if you spend enough time and effort.

Consider that to simply be part of the puzzle you're working on. It's like
the need to actually _open_ the box containing the jigsaw puzzle pieces
before you start working on the puzzle. It may not seem like it's actually
part of working on the puzzle, but it is a prerequisite nevertheless.

If you don't have the specification, then you are necessarily trusting
someone else to represent the information contained within as an
abstraction of some sort. And if you're going to do that, you might as well
trust Microsoft, JetBrains, etc. to do it for you and present the
information in a more readily-consumable fashion.

Find the spec. And for good measure, when you do find the spec, post a
message here with the URL so anyone else looking for it has a better chance
of finding it themselves.

Pete
 