C# Grammar issues

MBR · Mar 9, 2007

Hello... I'm using the grammar at:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csspec/html/vclrfcsharpspec_c.asp
as a reference in creating my own C# parser using a custom framework.
(Please let me know if there's a better group to post in.)

Some questions:
(1) Is this grammar specification known to be complete and correct?
(2) Is there a normalized LL grammar available already suited for
(backtracking) recursive descent systems?
(3) Much of the grammar can be simplified by using EBNF-style
specifications -- it would be nice to find one this way already.
(5) Is there a "parameterized" version that allows for C# 2.0 and C# 3.0,
with and without managed extensions?
(6) Are there alternate sources? (I've found some incomplete grammars and
some that are already re-purposed to the point of being unreadable.)

(7) There are simple and some not so simple left-recursions in the grammar.

The simple, direct recursions can be changed to EBNF-style repetitions
without left recursion:

multiplicative-expression:
unary-expression
multiplicative-expression * unary-expression
multiplicative-expression / unary-expression
multiplicative-expression % unary-expression

becomes (I think):

multiplicative_expression ::=
(unary_expression, "*" )* , unary_expression |
(unary_expression, "/" )* , unary_expression |
(unary_expression, "%" )* , unary_expression.
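A note on the rewrite above: as written, each alternative repeats a single operator, so a mixed chain such as a * b / c would not match any one alternative. A form often used for recursive descent folds the three operators into one repetition: unary_expression, (("*" | "/" | "%"), unary_expression)*. The sketch below (Python, for brevity) shows that shape as a loop that builds a left-associative tree. The names tokenize and parse_multiplicative are hypothetical, and plain identifiers or numbers stand in for unary_expression.

```python
import re

# Assumption: identifiers and integers stand in for unary_expression.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[*/%])")
OPERATORS = {"*", "/", "%"}


def tokenize(text):
    """Split the input into operand and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_multiplicative(tokens):
    """Parse: unary_expression { ("*" | "/" | "%") unary_expression }.

    The loop replaces the left-recursive rule; folding each new operand
    into `left` keeps the tree left-associative, as the original rule implies.
    """
    pos = 0

    def unary():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in OPERATORS:
            raise SyntaxError("expected an operand")
        token = tokens[pos]
        pos += 1
        return token

    left = unary()
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        operator = tokens[pos]
        pos += 1
        right = unary()
        left = (operator, left, right)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return left


if __name__ == "__main__":
    print(parse_multiplicative(tokenize("a * b / c % d")))
```

Each node is a tuple of (operator, left, right), so the mixed chain nests to the left just as the left-recursive production would.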

But there are also some very deep recursions such as this one (of many that
can be detected):

type -->
| reference_type -->
| | array_type -->
| | | non_array_type -->
< < < < type <-- Recursive

I'm wondering if this is necessary or even correct. Unlike the direct
recursions, in some of these cases it's hard to tell what's "meant", which
makes rewrites difficult.
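Chains like this can be found mechanically. The sketch below (Python) assumes the grammar is held as a dictionary mapping each nonterminal to its alternatives, follows only the first symbol of each alternative, and reports a cycle when a nonterminal can reach itself in leftmost position. The grammar fragment is a simplified stand-in built from the chain quoted above, not the full specification, and the function names are hypothetical. Empty (nullable) productions are ignored for simplicity.

```python
# Simplified fragment; terminals are any symbols that are not keys.
GRAMMAR = {
    "type": [["value_type"], ["reference_type"]],
    "value_type": [["int"]],
    "reference_type": [["class_type"], ["array_type"]],
    "class_type": [["identifier"]],
    "array_type": [["non_array_type", "rank_specifiers"]],
    "non_array_type": [["type"]],
    "rank_specifiers": [["[", "]"]],
}


def leftmost_edges(grammar):
    """Map each nonterminal to the nonterminals that can start one of its alternatives."""
    return {
        name: {alt[0] for alt in alternatives if alt and alt[0] in grammar}
        for name, alternatives in grammar.items()
    }


def find_left_cycle(edges, start):
    """Depth-first search for a leftmost path start -> ... -> start."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for successor in sorted(edges[node]):
            if successor == start:
                return path + [start]
            if successor not in seen:
                seen.add(successor)
                stack.append((successor, path + [successor]))
    return None


def left_recursive_cycles(grammar):
    """Return one leftmost cycle for every left-recursive nonterminal."""
    edges = leftmost_edges(grammar)
    cycles = {}
    for name in grammar:
        cycle = find_left_cycle(edges, name)
        if cycle is not None:
            cycles[name] = cycle
    return cycles


if __name__ == "__main__":
    for name, cycle in left_recursive_cycles(GRAMMAR).items():
        print(name, ":", " -> ".join(cycle))
```

In this fragment the cycle passes through array_type, whose first symbol leads back to type. One common way to parse such a rule by recursive descent is to read the non-array part first and then loop over any rank specifiers that follow.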

Any pointers/advice appreciated...
thanks,
mike

Tom Dacon · Mar 9, 2007

This is nowhere near an answer to your question, but I'm curious to know why
you're trying to write your own C# parser.

Since this is essentially a proprietary language (submissions to ECMA
notwithstanding) the behavior of MS's parser is by definition the "official"
behavior, and the inevitable ambiguous details of implementation are far
from completely documented. So trying to come up with a functionally
equivalent parser in the absence of the internal MS code is a fool's errand.

So why waste your time? And, in fact, what's the point?

But nevertheless, this is a serious question, not a flame, and I hope you
will see your way clear to respond.

Tom Dacon
Dacon Software Consulting

MBR · Mar 9, 2007

Tom Dacon said:
This is nowhere near an answer to your question, but I'm curious to know
why you're trying to write your own C# parser.

Since this is essentially a proprietary language (submissions to ECMA
notwithstanding) the behavior of MS's parser is by definition the
"official" behavior, and the inevitable ambiguous details of
implementation are far from completely documented. So trying to come up
with a functionally equivalent parser in the absence of the internal MS
code is a fool's errand.

You may be right, although I hope it's not the case - considering the ECMA
submission as you mentioned.

So why waste your time? And, in fact, what's the point?

There are many reasons why one would want to do such a thing: as a general
exercise, to understand various parsing systems/tradeoffs, to have a system
that exists outside of the MS development environment, to have a system that
exhibits specialized/particular behaviors that 3rd party systems don't
support, etc. -- my answer is some percentage of each of these. C# is an
initial target language (and one I use often), so that's why I'm starting
with it.

thanks
m

But nevertheless, this is a serious question, not a flame, and I hope you
will see your way clear to respond.

Tom Dacon
Dacon Software Consulting

Jon Skeet [C# MVP] · Mar 9, 2007

Tom Dacon said:
This is nowhere near an answer to your question, but I'm curious to know why
you're trying to write your own C# parser.

Since this is essentially a proprietary language (submissions to ECMA
notwithstanding) the behavior of MS's parser is by definition the "official"
behavior, and the inevitable ambiguous details of implementation are far
from completely documented. So trying to come up with a functionally
equivalent parser in the absence of the internal MS code is a fool's errand.

So why waste your time? And, in fact, what's the point?

But nevertheless, this is a serious question, not a flame, and I hope you
will see your way clear to respond.

Would you ask the same question of the Mono team? Just as an example of
why someone might want to do it...

Laura T. · Mar 9, 2007

I don't know whether that link is current.
I use the specs from here
http://msdn2.microsoft.com/en-us/netframework/aa569283.aspx.

You can find more here (like EBNF style C# grammar)
http://dotnet.jku.at/Projects/Rotor/2.0b/HowTo.html.

Some other links you might find useful:
http://www.antlr.org/grammar/list
http://www.ssw.uni-linz.ac.at/Research/Projects/Coco/

Laura T. · Mar 9, 2007

There are a few nice things you can build from a C# parser, such as
static analysis tools, executing C# files as scripts (C#.Script),
automated testing, etc.

Tom Dacon said:
This is nowhere near an answer to your question, but I'm curious to know
why you're trying to write your own C# parser.

Since this is essentially a proprietary language (submissions to ECMA
notwithstanding) the behavior of MS's parser is by definition the
"official" behavior, and the inevitable ambiguous details of
implementation are far from completely documented. So trying to come up
with a functionally equivalent parser in the absence of the internal MS
code is a fool's errand.

So why waste your time? And, in fact, what's the point?

But nevertheless, this is a serious question, not a flame, and I hope you
will see your way clear to respond.

Tom Dacon
Dacon Software Consulting

MBR · Mar 9, 2007

Thanks for the response. I found most of these during my original search,
but not all.
See notes below:

Laura T. said:
Don't know if the link is the current.
I use the specs from here
http://msdn2.microsoft.com/en-us/netframework/aa569283.aspx.

This is great. The Microsoft link I found seems both stale and outright
wrong -- I'm not sure this one addresses all the problems, but at least it
doesn't contain the one suspect recursion I noted in my first post.

You can find more here (like EBNF style C# grammar)
http://dotnet.jku.at/Projects/Rotor/2.0b/HowTo.html.

Given the description, this is close to what I've been looking for; however,
the C# link says "Coming Soon", and all related links seem long dead. Do
you have an alternate, active link or a copy of this grammar?

Some other links you might find useful:
http://www.antlr.org/grammar/list
http://www.ssw.uni-linz.ac.at/Research/Projects/Coco/

These are a little harder to follow in general (unless you happen to be
ANTLR or Coco), but are a great resource to go to when there's a specific
issue with a definition.

It looks like multiple documents will need to be harvested, but I
should be able to get 'er done...

thanks,
m

Shawn B. · Mar 10, 2007

There are a few nice things you can make from a C# parser.

Like static analysis tools, executing C# files as script (C#.Script),
automated testing etc.

And a new parser from which to test out new ideas. It was in writing a C#
compiler that I was able to extend the compiler to provide new keywords that
drastically alter the compiled binary... the keyword "parallel" for
executing another function or code block concurrently (automatically
providing the appropriate synchronization, if any, and even able to
determine when to use an Interlocked.Increment or a reader-writer lock,
etc.), an "async" keyword for executing a function as an asynchronous
delegate instead, and a "distributed" keyword for executing the function on
another machine in parallel as a grid cluster. There is also an "inline"
keyword that lets me express that I want the contents of a function to be
inlined instead (I have a need).

I was able to provide a special extensibility point in my compiler to allow
me to extend its optimizer and language features with plugins. Using this,
I'm experimenting with DSL extensions to get LINQ-like capabilities (C# 3.0
features) and other types of things so I can mix logic in à la Prolog, among
other things. Purely academic on my part, but very fascinating.

My C# parser does not cover the complete spec and I'm hardly a compiler guru;
in fact, my code probably stinks, but it does allow me to experiment with ideas.

Anyway, there are many reasons why one would want to write a new C# parser
or have a complete grammar.

In my case, it was just so I could test out some ideas, though I'd like to use
it in production code some day. I do have internal utilities developed using
it, but nothing production-ready.

Thanks,
Shawn

Tom Dacon · Mar 10, 2007

Would you ask the same question of the Mono team? Just as an example of
why someone might want to do it...

Well, actually I would. In my individual and perhaps extreme minority
opinion, Mono is and always will be lame. It'll always be at least one step
behind where MS is taking DotNet, usually more, no matter how energetic and
dedicated the implementors are. I have no doubt that Miguel de Icaza is a
real smart guy (is he still involved?), and I further have no doubt that the
contributors have been enjoying the challenges of reverse-engineering all
MS's DotNet stuff and have probably learned a lot about compilers and
runtime library development and so forth.

But still. How depressing it would become in the long run, to be constantly
reacting to what someone else does, instead of building something new. You
know, DotNet was in the labs at MS for years before it ever saw the light of
day, and they've got an enormous budget for that sort of stuff. I don't know
for sure, and I'm not so interested in it as to try to follow the Mono
blogs, but I'd bet that there's a lot of burnout in the people who are doing
this.

So yeah. I would ask the same question of the Mono team.

Tom Dacon
Dacon Software Consulting

Tom Dacon · Mar 10, 2007

I was able to provide a special extensibility point in my compiler to
allow me to extend its optimizer and language features with plugins.
Using this, I'm experimenting with DSL extensions to get LINQ-like
capabilities (C# 3.0 features) and other types of things so I can mix
logic in à la Prolog, among other things. Purely academic on my part, but
very fascinating.

OK. This is some cool stuff. I'm a believer.

Tom

Jon Skeet [C# MVP] · Mar 11, 2007

Tom Dacon said:
Well, actually I would. In my individual and perhaps extreme minority
opinion, Mono is and always will be lame. It'll always be at least one step
behind where MS is taking DotNet, usually more, no matter how energetic and
dedicated the implementors are.

Well, Mono had a release with generics in before MS had actually
released .NET 2.0, I believe...

I have no doubt that Miguel de Icaza is a
real smart guy (is he still involved?), and I further have no doubt that the
contributors have been enjoying the challenges of reverse-engineering all
MS's DotNet stuff and have probably learned a lot about compilers and
runtime library development and so forth.

They've also provided a useful platform for easier development on
Linux. That's not exactly an inconsiderable achievement.

But still. How depressing it would become in the long run, to be constantly
reacting to what someone else does, instead of building something new. You
know, DotNet was in the labs at MS for years before it ever saw the light of
day, and they've got an enormous budget for that sort of stuff. I don't know
for sure, and I'm not so interested in it as to try to follow the Mono
blogs, but I'd bet that there's a lot of burnout in the people who are doing
this.

There's more to Mono than just the stuff that MS does. GTK# is one
example, for instance.

I don't see anything useless about the Mono project though, when it
comes to aiding Linux development.
