Q: implementing expression logic - hard coded or dynamic

  • Thread starter: Jonathan

Jonathan

Hi all,

I have a file consisting of fixed-width records from which I need to
extract only those lines meeting certain conditions.


These conditions do change, and I find myself recoding and recompiling for
each set of conditions, then running again.

e.g.

scenario 1:
extract lines where field1=100023 and field3 = "PP"

scenario 2:
extract lines where field1 in (1234,2335,12123,213213)

scenario 3:
extract lines where (field3 in (PP,OP,TD,WL) and field4 = 04)
OR (field3 = PA and field5 = 03)


As you see it can easily be a non-trivial expression that determines
which lines need to be extracted - this is making it very difficult to
code for every eventuality. It's very SQL-esque, but it's not possible to
use SQL in this case: the program reads and processes each line via a stream.


I was thinking about creating an expression builder, allowing the user
to define the expression with any nesting etc and incorporating this
into the code in some dynamic way at runtime.

I cannot seem to work out how I might do this, though.

Anyone have an inkling on how this might best be done?

Grateful for your thoughts

Regards
Jon
 
Jonathan said:
I was thinking about creating an expression builder, allowing the user
to define the expression with any nesting etc and incorporating this
into the code in some dynamic way at runtime.

I cannot seem to work out how I might do this, though.

Anyone have an inkling on how this might best be done?

If this is an internal tool and you can trust the input (not
necessarily to be correct, but to not do anything insecure etc) you
could put C# code snippets in your config file and then use
CSharpCodeProvider to build the snippets into types with appropriate
context.
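
To make the suggestion concrete, here is a minimal sketch of compiling a condition snippet with CSharpCodeProvider. It assumes the .NET Framework (CodeDOM compilation is not supported on .NET Core and later), and the names SnippetFilter, GeneratedFilter and Matches are invented for illustration. The snippet is taken to be a C# boolean expression over a string variable called line. Anything the snippet contains gets compiled and run, so this only suits trusted input.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using System.Text;
using Microsoft.CSharp;

public static class SnippetFilter
{
    // 'condition' is a C# boolean expression over a string named 'line',
    // e.g.  line.Substring(6, 2) == "PP"
    // The wrapper class and method names below are hypothetical.
    public static Predicate<string> Compile(string condition)
    {
        string source =
            "public static class GeneratedFilter {" +
            "  public static bool Matches(string line) { return " + condition + "; }" +
            "}";

        CompilerParameters options = new CompilerParameters();
        options.GenerateInMemory = true;   // load the result into this process

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CompilerResults results = provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
            {
                // Report every compiler error back to whoever wrote the snippet.
                StringBuilder message = new StringBuilder("Snippet failed to compile:");
                foreach (CompilerError error in results.Errors)
                    message.AppendLine().Append(error.ToString());
                throw new ArgumentException(message.ToString(), "condition");
            }

            MethodInfo method = results.CompiledAssembly
                .GetType("GeneratedFilter")
                .GetMethod("Matches");

            // Bind the generated static method to a delegate so calls avoid reflection.
            return (Predicate<string>)Delegate.CreateDelegate(typeof(Predicate<string>), method);
        }
    }
}
```

The returned Predicate<string> can be called once per line of the stream. Predicate<string> is used rather than Func<string, bool> so that the sketch needs nothing beyond .NET 2.0.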

LINQ to Objects makes it very easy to express this sort of logic in C#.

(Of course, the latter relies on you running .NET 3.5 so as to compile
C# 3 code at execution time...)
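
As an illustration of the LINQ point, scenario 3 from the original post might look like the sketch below. The record layout is an assumption (the thread never gives offsets or widths), and Record, Scenarios and Scenario3 are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the real field offsets are not given in the thread.
public sealed class Record
{
    public string Field1, Field3, Field4, Field5;

    // Assumed layout: field1 = columns 0-5, field3 = 6-7, field4 = 8-9, field5 = 10-11.
    // Lines shorter than 12 characters will throw ArgumentOutOfRangeException.
    public static Record Parse(string line)
    {
        return new Record
        {
            Field1 = line.Substring(0, 6).Trim(),
            Field3 = line.Substring(6, 2),
            Field4 = line.Substring(8, 2),
            Field5 = line.Substring(10, 2)
        };
    }
}

public static class Scenarios
{
    static readonly string[] Codes = { "PP", "OP", "TD", "WL" };

    // (field3 in (PP,OP,TD,WL) and field4 = 04) OR (field3 = PA and field5 = 03)
    public static IEnumerable<string> Scenario3(IEnumerable<string> lines)
    {
        return from line in lines
               let r = Record.Parse(line)
               where (Codes.Contains(r.Field3) && r.Field4 == "04")
                  || (r.Field3 == "PA" && r.Field5 == "03")
               select line;
    }
}
```

The query is lazy: nothing is parsed until the result is enumerated, and each line is tested as it is pulled from the source sequence.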
 
There are expression parsers available. As it happens, I'm working on
just such a parser at the moment, but it isn't in a postable state yet...
However, if .NET 3.5 is an option, there is an example (in the samples
download) of a dynamic LINQ expression parser that might be suitable.
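
For a sense of what such a parser would produce, here is a rough sketch that builds a filter at runtime with the System.Linq.Expressions API in .NET 3.5. This is not the dynamic LINQ sample itself, and the step of parsing user text into these calls is left out. Row, FilterBuilder, Eq and In are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical parsed record; field names mirror the original post.
public sealed class Row
{
    public string Field1, Field3, Field4, Field5;
}

public static class FilterBuilder
{
    // Every sub-expression must refer to the same parameter instance.
    static readonly ParameterExpression RowParam = Expression.Parameter(typeof(Row), "r");

    // field = value
    public static Expression Eq(string field, string value)
    {
        return Expression.Equal(
            Expression.PropertyOrField(RowParam, field),
            Expression.Constant(value));
    }

    // field in (v1, v2, ...), expressed as an OR-chain of equality tests.
    // Throws InvalidOperationException if no values are supplied.
    public static Expression In(string field, params string[] values)
    {
        return values
            .Select(v => Eq(field, v))
            .Aggregate((a, b) => Expression.OrElse(a, b));
    }

    // Turn the finished tree into an ordinary delegate.
    public static Func<Row, bool> Compile(Expression body)
    {
        return Expression.Lambda<Func<Row, bool>>(body, RowParam).Compile();
    }

    // Scenario 3 from the original post, assembled piece by piece.
    public static Func<Row, bool> Scenario3()
    {
        Expression body = Expression.OrElse(
            Expression.AndAlso(In("Field3", "PP", "OP", "TD", "WL"), Eq("Field4", "04")),
            Expression.AndAlso(Eq("Field3", "PA"), Eq("Field5", "03")));
        return Compile(body);
    }
}
```

Because the tree is built from plain method calls, arbitrary nesting falls out naturally: an expression builder UI or a parser only has to decide which of Eq, In, AndAlso and OrElse to call.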

Marc
 
Jon said:
If this is an internal tool and you can trust the input (not
necessarily to be correct, but to not do anything insecure etc) you
could put C# code snippets in your config file and then use
CSharpCodeProvider to build the snippets into types with appropriate
context.

Using CSharpCodeProvider takes a little care, though. If you only compile
the code once at startup, it should be no problem, but if you're going to be
recompiling it during runtime you should be aware that loaded assemblies
cannot be unloaded -- they'll sit there and take up memory. You can overcome
this by loading them into a separate AppDomain and unloading this when
you're done, but then you have to deal with marshalling issues. In addition,
compiling code to assemblies and loading these is atrociously slow compared
to using a custom parser, so some form of caching is usually needed to
prevent unnecessary recompiles.

In short, if you need to compile more than once, this solution is not as
convenient as it sounds.
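
The caching mentioned above could be as simple as the sketch below: a dictionary keyed on the expression text, so each distinct expression is compiled at most once per run. PredicateCache is a made-up name, and the compile step is passed in as a delegate because the sketch does not assume any particular compiler or parser. It does nothing about unloading: every compiled assembly still stays loaded for the life of the AppDomain.

```csharp
using System;
using System.Collections.Generic;

public sealed class PredicateCache
{
    private readonly Dictionary<string, Func<string, bool>> cache =
        new Dictionary<string, Func<string, bool>>();
    private readonly Func<string, Func<string, bool>> compile;
    private readonly object gate = new object();

    // Number of times the (expensive) compile step actually ran.
    public int CompileCount { get; private set; }

    // 'compile' is whatever turns expression text into a predicate
    // (a code provider, a custom parser, ...); it is supplied by the caller.
    public PredicateCache(Func<string, Func<string, bool>> compile)
    {
        this.compile = compile;
    }

    public Func<string, bool> Get(string expressionText)
    {
        lock (gate)
        {
            Func<string, bool> predicate;
            if (!cache.TryGetValue(expressionText, out predicate))
            {
                // First request for this text: compile once and remember the result.
                predicate = compile(expressionText);
                CompileCount++;
                cache.Add(expressionText, predicate);
            }
            return predicate;
        }
    }
}
```

Keying on the raw text means two expressions that differ only in whitespace are compiled separately; normalising the text first would be a reasonable refinement.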

Jon said:
LINQ to Objects makes it very easy to express this sort of logic in C#.

The OP's problem is practically what LINQ was invented for, so I'd certainly
look into it first. Upgrading to .NET 3.5 is a relatively painless affair,
especially if it's a new project.
 
Jon,

Much obliged for your thoughts and direction. I've had a quick foray
into the world of dynamic code generation in C# and it looks both
interesting and promising.

If my understanding is correct the following will be true/possible:

o I would need to create a stand-alone utility from generated code.
o I could not "inline" the created code within the current utility;
  ergo, I would, in effect, call out to it after compiling it.
o I would be able to create complex conditional filters using a simple
  library of code snippets and appropriate glueing.
o I will be able to use .NET 2.0.


I also had a look at LINQ and, whilst it looks like a very useful tool,
it appears to require in-memory data sets. In this case that wouldn't be
practical since the files I am processing contain c. 5 million lines at
c. 1GB and probably wouldn't squeeze into working memory. Correct me if
I have made the wrong assumption!

I have perhaps 3 weeks' experience with C# (although, in mitigation, I do
have a comp. sci. background), so my attempt to bring this to fruition
will be peppered with a large sprinkling of optimism!

Kind regards
Jonathan
 
Marc said:
For info, I found a link to the LINQ expression parser on ScottGu's
Blog:
http://weblogs.asp.net/scottgu/arch...t-1-using-the-linq-dynamic-query-library.aspx

Marc

Marc,

Much obliged for your input.

I took a quick look at LINQ following your and Jon's first posts but,
having seen what Scott has to say in his blog, I will spend a little
more time exploring this.

Regarding the data set I am using, it is semi-structured - each line
contains a message which has a predefined structure according to its
content type. Each line, however, may have a different content type and,
as such, doesn't follow the pattern expected of a single set of data.

I initially discounted LINQ on the understanding that it could be
applied only to in-memory structured data sets but that may have been
the wrong conclusion on my part.

Thanks again, I have much more to look at now and don't feel at such a
dead end.

Regards
Jonathan
 
Jeroen,

Thanks for your insight into this area. I had already posted a reply to
Jon summarising what I had understood about it. My thoughts appear to be
a little naive in hindsight.

I will take a more in-depth look at both areas of functionality and work
out how I might apply one or both of them.

Many thanks again - it really helps to get personal input rather than
struggling alone with what are invariably terse tomes!

Regards,
Jonathan
 
Jonathan said:
Much obliged for your thoughts and direction. I've had a quick foray
into the world of dynamic code generation in C# and it looks both
interesting and promising.

If my understanding is correct the following will be true/possible:

o I would need to create a stand-alone utility from generated code.

No, I don't think so.

Jonathan said:
o I could not "inline" the created code within the current utility
  ergo, I would, in effect, call-out to this having compiled it

Nah. Compile to an in-memory assembly and you're away.

As an example, see my "snippet compiler" (Snippy) which you can
download (including source code) from
http://csharpindepth.com/Downloads.aspx

It lets you type in some code, then compiles and executes it
immediately (when you press a button).

Jonathan said:
o I would be able to create complex conditional filters using a simple
  library of code snippets and appropriate glueing.

Yes - or use LINQ :)

Jonathan said:
o I will be able to use .NET 2.0

So long as you don't need C# 3 features (such as LINQ), absolutely.

Jonathan said:
I also had a look at LINQ and, whilst it looks like a very useful tool,
it appears to require in-memory data sets. In this case that wouldn't be
practical since the files I am processing contain c. 5 million lines at
c. 1GB and probably wouldn't squeeze into working memory. Correct me if
I have made the wrong assumption!

Definitely. It's one of the common misunderstandings about LINQ to
Objects - the processing happens in memory, but unless you invoke any
operators which require the whole data sequence (e.g. ordering) you can
process the data as a stream. See

http://msmvps.com/blogs/jon.skeet/archive/2008/01/08/linq-to-objects-not-just-for-in-memory-collections.aspx

for an example of what I mean (which may be quite like your situation).
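
A minimal sketch of that streaming idea, not taken from the linked post: an iterator block yields one line at a time, and Where filters lines as they pass through, so only the current line is held in memory. LineStreaming, ReadLines and CopyMatching are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LineStreaming
{
    // Lazily yields lines; nothing is read until the caller enumerates.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    // Writes every line that satisfies the predicate and returns how many were written.
    public static int CopyMatching(TextReader input, TextWriter output, Func<string, bool> predicate)
    {
        int written = 0;
        foreach (string line in ReadLines(input).Where(predicate))
        {
            output.WriteLine(line);
            written++;
        }
        return written;
    }
}
```

For a real file the reader and writer would be a StreamReader and StreamWriter; from .NET 4 onwards File.ReadLines provides the same lazy line sequence. Adding an operator such as OrderBy to the query would force the whole sequence to be buffered, which is the case to avoid with a 1GB input.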

Jonathan said:
I have perhaps 3 weeks' experience with C# (although to mitigate I do
have a comp. sci background.) so my attempt to bring this to fruition
will be peppered with a large sprinkling of optimism!

Best of luck - and we're ready to help :)
 
Jon,

Many thanks again.

All good news as far as I'm concerned - each method looks to be quite
versatile and able to meet my needs.

So, in a masochistic yet familiar manner, I shall be attempting both then.

Optimism abounds eh?


Regards
Jonathan
 
Jonathan said:
All good news as far as I'm concerned - each method looks to be quite
versatile and able to meet my needs.

So, in a masochistic yet familiar manner, I shall be attempting both then.

:)

Jonathan said:
Optimism abounds eh?

All sounds good to me. Let us know how you get on - good luck!

Jon
 
