Regex on RTF files

Ganesh

Hi,

I have an RTF file as follows:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} World \par}

I want a regex to get the value next to the \expnd; in this case the value is
'1' ({\expnd1 Hello}), but the next '\expnd' is not a valid RTF, so we should
not take that. I need only within the { } and \expnd<value>.

I did this in PHP, but couldn't do it in C#. Can someone help me?

Regards,
Ganesh
 
Jesse Houwing

Hello Ganesh,

Regex rx = new Regex(@"\expnd(\d+)", RegexOptions.None);

rx.Match(rtfstring);

That should get you started. But I don't understand what you mean by:
- but the next '\expnd' is not a valid RTF
- and I need only within the { }

Maybe you can provide some samples of what to match and what not to match,
and why, so that we can help you better.
 
Ganesh

Hi Jesse Houwing,

Thanks very much for your reply. Check my new string:
string str =
@"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

In the RTF specification, the keyword \expnd always appears in the form
{\expnd<value> <data>}.

So in the string there are two \expnd keywords that start with {, and
another \expnd that does not start with {; that one is not valid RTF. When I
run the regex I want to get only the valid \expnd, which is
{\expnd<value> <data>}, and the value next to the \expnd. In this string there
are two values, 1 and 1. There can be multiple \expnd keywords in a string,
and I want to get all the values next to them.

This is my problem. Your sample did work, but how do I get all the values?

Regards,
Ganesh
 
Pavel Minaev

Ganesh said:
In the RTF specification, the keyword \expnd always appears in the form
{\expnd<value> <data>}

Then why don't you just match against "{\expnd(\d+)" (i.e. include the
curly brace?)
 
Jesse Houwing

Hello Ganesh,

After getting a Match, you can get the next match by calling NextMatch on
your match object:

Regex rx = new Regex(@"{\expnd(\d+)", RegexOptions.None);
Match m = rx.Match(rtfText);
while (m.Success)
{
    string value = m.Groups[1].Value;
    m = m.NextMatch();
}

Or you can use Matches to get them all at once:

MatchCollection mc = rx.Matches(rtfText);
foreach (Match m in mc)
{
    string value = m.Groups[1].Value;
}

Jesse
 
Ganesh

Hi Jesse Houwing,

Thanks for the code again. Check my code:

-- Begin Code --

sRtfData =
@"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

Match m = rx.Match(sRtfData);

while (m.Success)
{
    string value = m.Groups[1].Value;
    m = m.NextMatch();
}

-- End Code --

m.Success is always false.

Regards,
Ganesh

 
Jesse Houwing

Hello Ganesh,

Where is your Regex definition?

Jesse
 
Jesse Houwing

Hello Ganesh,

Ah, I think I found what was going wrong. Even in a verbatim string (@"...")
the regex engine still treats a backslash as its own escape character, so you
need to write \\ in the pattern to match a single literal \. Stupid me ;)

Regex rx = new Regex(@"{\\expnd(\d+)", RegexOptions.None);

Should work...
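
For reference, here is a minimal, self-contained sketch of the whole thing. The class name ExpndExtractor, the Extract helper and the shortened sample string are illustrative assumptions, not something from the posts above. The brace is also escaped (\{), which .NET does not require here but which makes the literal intent explicit. It assumes the values are plain unsigned digits, as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ExpndExtractor
{
    // "\{" is a literal brace and "\\" a literal backslash;
    // the digits after "expnd" are captured in group 1.
    private static readonly Regex Rx = new Regex(@"\{\\expnd(\d+)");

    // Returns the value of every {\expnd<value> ...} group, in document order.
    public static List<string> Extract(string rtf)
    {
        var values = new List<string>();
        foreach (Match m in Rx.Matches(rtf))
        {
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    public static void Main()
    {
        // Shortened version of the sample string from this thread.
        string rtf = @"{\rtf1\ansi\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

        // Single backslash: the pattern finds nothing in this string.
        Console.WriteLine(Regex.IsMatch(rtf, @"\{\expnd(\d+)")); // False

        // Doubled backslash: both braced groups are found, the bare \expnd is skipped.
        Console.WriteLine(string.Join(",", Extract(rtf))); // 1,1
    }
}
```

The single-backslash pattern fails because the regex engine reads \e as the escape character (U+001B), not as a backslash followed by the letter e.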

Jesse

 
Ganesh

Hi Jesse,

That worked. Thanks very much for your help. At last my problem is solved.

And thanks to everyone.

Regards,
Ganesh

 
