jkimbler said:
As part of our QA of hardware and firmware for the company I work for,
we need to automate some testing of devices and firmware. Since not
everybody here knows C#, I'm looking to create a new scripting
language that makes writing automated tests simpler. Really, I'm
looking to kind of abstract the power of the C# language into a
simpler language that's easier to learn. The script files would be
interpreted by a script interpreter written in C#. This interpreter
would do the background work of connecting to the devices, storing
variables, comparing values, hitting the database, etc. and return a
log that proves pass or fail of the test.
I've read your other replies in this thread that predate my reply. I
know you've read about writing a compiler for .NET. Writing a modern
scripting language (as opposed to something like classic BASIC with line
numbers, or DOS-style batch scripting) is just like writing a compiler
for the lexical analysis and parsing stages. There are lots of other
references online for those bits, and they're pretty simple once you get
the ideas right, so I won't go into them in detail. The difference comes
in what you do with the parse tree when you're done.
There are two simple approaches:
1) Create a visitor, acting much like a code generator, which recurses
over the tree and executes the syntax tree's nodes using program state.
So if you have source that looks like this (completely hypothetical
language but hopefully easily understood):
    read a
    if a == 42 then
        print "you picked the magic number"
    fi
    for i = 1 to a do
        print "blah"
    od
... you might then create a tree which looks a bit like this (using Lisp
s-exprs to denote trees as lists of lists):
    (statement-list
      (read a)
      (if (= a 42)                 // predicate part
        (statement-list            // true part
          (print "you picked the magic number"))
        (null-statement)           // false part
      )
      (for i 1 a                   // variable from to
        (statement-list            // body
          (print "blah"))
      )
    )
If you then have a visitor (perhaps following the visitor pattern, with
the nodes in the above tree being instances of different classes), it
should perform a different action for each kind of node. Imagine a
general 'interpret' method which takes a node of the above tree as input
and returns the 'return value', i.e. the evaluated value of that
subtree, as output. This assumes the interpret method has access to
world / stack scopes, maybe as a Stack<Dictionary<string,string>> or
whatever you like, to map variable names to values:
    case node of
      read:
        Console.ReadLine()
        store return value in variable named in node
        return null (or maybe the value read, depending on
          language semantics)
      if:
        recurse into predicate and save return value
        if return value is true then recurse into true-part
        otherwise recurse into false-part
      statement-list:
        iterate through each statement and recurse into each
          in turn
      null-statement:
        return null (do nothing)
      for:
        check 'to' >= 'from' value, if not then return
        set loop variable to 'from' value
        1. recurse into body
        2. increment loop variable
        3. check loop variable <= 'to' value; if not then return
        4. go to step 1
      print:
        recurse into argument and Console.WriteLine return value
      =:
        evaluate (recurse) first and second arguments and
          return result of comparison
      <literal>:
        return value of literal (like '42', etc.)
      <variable>:
        return value of variable, looking it up in scope stack
      // etc.
    esac // end of 'case'
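The evaluator above could be sketched in C# roughly as follows. This is only a minimal illustration under several assumptions: the node record types, the Interpreter class and SampleProgram are hypothetical names; a single dictionary stands in for the scope stack; values are plain objects; and 'read' stores input as an int whenever it parses as one, so that 'a == 42' and the loop bound work. It needs C# 9 or later for records.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical node classes for the sample tree (names are illustrative).
public abstract record Node;
public record Literal(object Value) : Node;
public record Variable(string Name) : Node;
public record Equal(Node Left, Node Right) : Node;
public record Read(string Name) : Node;
public record Print(Node Argument) : Node;
public record If(Node Predicate, Node TruePart, Node FalsePart) : Node;
public record For(string Name, Node From, Node To, Node Body) : Node;
public record StatementList(IReadOnlyList<Node> Statements) : Node;
public record NullStatement : Node;

public class Interpreter
{
    // Simplification: one global scope instead of a stack of scopes.
    private readonly Dictionary<string, object> scope = new();
    private readonly TextReader input;
    private readonly TextWriter output;

    public Interpreter(TextReader input, TextWriter output)
    {
        this.input = input;
        this.output = output;
    }

    // Evaluates a node and returns its value (null for statements).
    public object? Interpret(Node node)
    {
        switch (node)
        {
            case Literal literal:
                return literal.Value;
            case Variable variable:
                return scope[variable.Name];
            case Equal equal:
                return object.Equals(Interpret(equal.Left), Interpret(equal.Right));
            case Read read:
            {
                // Assumption: input that parses as an integer is stored as an int.
                string line = input.ReadLine() ?? "";
                scope[read.Name] = int.TryParse(line, out int number) ? (object)number : line;
                return null;
            }
            case Print print:
                output.WriteLine(Interpret(print.Argument));
                return null;
            case If conditional:
            {
                bool test = (bool)Interpret(conditional.Predicate)!;
                return Interpret(test ? conditional.TruePart : conditional.FalsePart);
            }
            case StatementList list:
                foreach (Node statement in list.Statements) Interpret(statement);
                return null;
            case NullStatement _:
                return null;
            case For loop:
            {
                // Both bounds are evaluated once; the loop runs zero times if last < first.
                int first = (int)Interpret(loop.From)!;
                int last = (int)Interpret(loop.To)!;
                for (int i = first; i <= last; i++)
                {
                    scope[loop.Name] = i;
                    Interpret(loop.Body);
                }
                return null;
            }
            default:
                throw new NotSupportedException(node.GetType().Name);
        }
    }
}

public static class SampleProgram
{
    // The s-expression tree from the post, built by hand instead of by a parser.
    public static Node Build() => new StatementList(new Node[]
    {
        new Read("a"),
        new If(new Equal(new Variable("a"), new Literal(42)),
            new StatementList(new Node[] { new Print(new Literal("you picked the magic number")) }),
            new NullStatement()),
        new For("i", new Literal(1), new Variable("a"),
            new StatementList(new Node[] { new Print(new Literal("blah")) })),
    });
}
```

Running it with something like new Interpreter(Console.In, Console.Out).Interpret(SampleProgram.Build()) reads a number and prints "blah" that many times. In this sketch a non-numeric input makes the 'for' node throw an InvalidCastException, and the loop variable is reset from the C# counter on each iteration, so assignments to it inside the body would not affect the loop.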
2) A higher-performance approach, and actually simpler in some cases, is
to generate code for a stack machine while recursing through the tree.
It's usually simpler when you need multiple return-value semantics;
consider how you'd need to interpret a 'break out of loop' command in
the 'for' node using the tree evaluation above.
With System.Reflection.Emit, it's pretty trivial to encode most logic
as IL. Alternatively, writing your own simple stack machine, one which
works with more general, 'scripting style' variant variables, is also
very easy.
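To give an idea of how small such a hand-written stack machine can be, here is a minimal sketch. Everything in it is an assumption made for illustration: the OpCode set, the Instruction record, and the StackMachine and StackMachineDemo classes are hypothetical, 'variant' values are just boxed objects, and jump targets are plain instruction indexes rather than labels. The demo program is a hand-assembled version of the 'for i = 1 to a' loop from the sample source.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

public enum OpCode { Push, Load, Store, Add, Equal, LessOrEqual, Print, Jump, JumpIfFalse }

// One instruction: an opcode plus an optional operand (literal, variable name or jump target).
public record Instruction(OpCode Code, object? Operand = null);

public static class StackMachine
{
    public static void Run(IReadOnlyList<Instruction> program,
                           Dictionary<string, object> variables, TextWriter output)
    {
        var stack = new Stack<object>();
        int pc = 0; // index of the next instruction to execute
        while (pc < program.Count)
        {
            Instruction ins = program[pc++];
            switch (ins.Code)
            {
                case OpCode.Push: stack.Push(ins.Operand!); break;
                case OpCode.Load: stack.Push(variables[(string)ins.Operand!]); break;
                case OpCode.Store: variables[(string)ins.Operand!] = stack.Pop(); break;
                case OpCode.Add:
                {
                    int right = (int)stack.Pop(), left = (int)stack.Pop();
                    stack.Push(left + right);
                    break;
                }
                case OpCode.Equal:
                {
                    object right = stack.Pop(), left = stack.Pop();
                    stack.Push(object.Equals(left, right));
                    break;
                }
                case OpCode.LessOrEqual:
                {
                    int right = (int)stack.Pop(), left = (int)stack.Pop();
                    stack.Push(left <= right);
                    break;
                }
                case OpCode.Print: output.WriteLine(stack.Pop()); break;
                case OpCode.Jump: pc = (int)ins.Operand!; break;
                case OpCode.JumpIfFalse:
                    if (!(bool)stack.Pop()) pc = (int)ins.Operand!;
                    break;
                default: throw new NotSupportedException(ins.Code.ToString());
            }
        }
    }
}

public static class StackMachineDemo
{
    // for i = 1 to a do print "blah" od, assembled by hand.
    public static Instruction[] ForLoop() => new Instruction[]
    {
        new(OpCode.Push, 1),          //  0: i = 1
        new(OpCode.Store, "i"),       //  1
        new(OpCode.Jump, 9),          //  2: go to the loop check
        new(OpCode.Push, "blah"),     //  3: loop top (body)
        new(OpCode.Print),            //  4
        new(OpCode.Load, "i"),        //  5: i = i + 1
        new(OpCode.Push, 1),          //  6
        new(OpCode.Add),              //  7
        new(OpCode.Store, "i"),       //  8
        new(OpCode.Load, "i"),        //  9: loop check, i <= a ?
        new(OpCode.Load, "a"),        // 10
        new(OpCode.LessOrEqual),      // 11
        new(OpCode.JumpIfFalse, 14),  // 12: leave the loop (index 14 is past the end)
        new(OpCode.Jump, 3),          // 13: back to the loop top
    };
}
```

With 'a' preset in the variables dictionary, StackMachine.Run(StackMachineDemo.ForLoop(), variables, Console.Out) prints "blah" once per iteration. A 'break' inside the body would simply be a Jump to index 14, which is the sort of thing that is awkward to express in the tree evaluator.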
The code generator looks pretty similar to the evaluation visitor
pattern above, except this time you're not evaluating and returning
values, but instead writing out stack commands to perform the work. So,
the above method from the evaluator visitor might look a bit like this
instead for code generation:
    case node of
      read:
        push appropriate scope variable onto stack
        push name of variable
        generate call to Console.ReadLine()
        call the 'SetValue' method on the scope
          (following the normal IL technique for calling instance methods)
      if:
        recurse into predicate
        create label IfFalse
          (e.g. see ILGenerator.DefineLabel in MSDN docs)
        generate 'jump if false' to IfFalse label
        recurse into true-part
        create label AfterIf
        generate unconditional jump to AfterIf
        mark label IfFalse
          (e.g. see ILGenerator.MarkLabel)
        recurse into false-part
        mark label AfterIf
      statement-list:
        iterate through each statement and recurse into each
          in turn
      null-statement:
        return (do nothing)
      for:
        create labels LoopTop, LoopNext, LoopCheck and AfterLoop
        create a 'loop frame', with references to these labels,
          and push it onto a 'loop stack' available to the code
          generator, if you want to support 'continue' and 'break'
          statements; these guys would be implemented with jumps
          to LoopNext and AfterLoop respectively.
        generate code to set loop variable to initial value
        generate jump to LoopCheck
        mark label LoopTop
        recurse into body
        mark label LoopNext
        generate code to increment loop variable
        mark label LoopCheck
        generate code to jump to LoopTop if variable still
          <= 'to' value
        mark label AfterLoop
        pop loop frame (if any)
      print:
        recurse into argument
        generate call to Console.WriteLine
      =:
        recurse into first
        recurse into second
        generate call to your comparison method
      <literal>:
        generate code to push literal onto stack
      <variable>:
        generate code to load variable value onto stack
      // etc.
    esac // end of 'case'
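As a concrete illustration of the 'if' case, here is a minimal System.Reflection.Emit sketch that emits IL for the sample 'if a == 42 then print ... fi' by hand. It is a simplification under stated assumptions, not a full code generator: the IfCodeGen class is a hypothetical name, 'a' arrives as a method argument instead of being loaded from a scope object, the comparison is a plain integer 'ceq' rather than a call to a variant comparison method, and output goes to a TextWriter argument instead of the console.

```csharp
#nullable enable
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;

public static class IfCodeGen
{
    // Builds a delegate equivalent to:
    //   if a == 42 then print "you picked the magic number" fi
    public static Action<int, TextWriter> Compile()
    {
        var method = new DynamicMethod("IfExample", typeof(void),
            new[] { typeof(int), typeof(TextWriter) });
        ILGenerator il = method.GetILGenerator();
        MethodInfo writeLine =
            typeof(TextWriter).GetMethod("WriteLine", new[] { typeof(string) })!;

        // Predicate: a == 42 (leaves 1 or 0 on the evaluation stack).
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4, 42);
        il.Emit(OpCodes.Ceq);

        // 'Jump if false' to the IfFalse label.
        Label ifFalse = il.DefineLabel();
        il.Emit(OpCodes.Brfalse, ifFalse);

        // True part: output.WriteLine("you picked the magic number").
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Ldstr, "you picked the magic number");
        il.Emit(OpCodes.Callvirt, writeLine);

        // Unconditional jump over the false part.
        Label afterIf = il.DefineLabel();
        il.Emit(OpCodes.Br, afterIf);

        // False part: a null-statement here, so nothing is emitted.
        il.MarkLabel(ifFalse);

        il.MarkLabel(afterIf);
        il.Emit(OpCodes.Ret);

        return (Action<int, TextWriter>)method.CreateDelegate(typeof(Action<int, TextWriter>));
    }
}
```

Calling IfCodeGen.Compile()(42, Console.Out) writes the message, while any other argument writes nothing. In a real generator the three commented sections would be produced by recursing into the predicate, true-part and false-part nodes rather than being written out inline.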
I hope the above sketch gives you some ideas to work with.
-- Barry