Regex to retain only the HTML body

Karch

If you run this:

string result = "<html><head></head><body>The body</body></html>";
result = retainBody.Replace(result, "$1");


With the following Regex:

private static readonly Regex retainBody = new Regex(
    @"<\s*body[^>]*>(.*)<[\s/]*body[^>]*>",
    RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);


You get this back:

<html><head></head>The body</html>

I want this instead:

The body
 
Nikola Stjelja

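Try this. It uses Match with a named group instead of Replace, so you read the captured body directly rather than substituting it back into the original string:

string result = "<html><head></head><body>The body</body></html>";
// IgnoreCase and Singleline are carried over from your original pattern so that
// upper-case tags and bodies spanning several lines still match.
Regex reg = new Regex(@"<\s*body[^>]*>(?<body>.*)<[\s/]*body[^>]*>",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
Match body = reg.Match(result);
Console.WriteLine(body.Groups["body"].Value);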
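If you would rather keep using Replace: Regex.Replace only substitutes the part of the input that the pattern matched, which is why the text outside the body tags survived in your output. One possible workaround is to make the pattern consume the whole input, so that the entire string is swapped for group 1. The following is a minimal sketch under that assumption, not a general HTML parser. The class name RetainBodyDemo and the method name Extract are made up for illustration, and the closing-tag part of the pattern is tightened slightly compared to the original.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RetainBodyDemo
{
    // The leading ^.* and trailing .*$ make the match span the whole input,
    // so Replace substitutes everything with the captured body (group 1).
    // Singleline lets '.' match newlines as well.
    private static readonly Regex retainBody = new Regex(
        @"^.*<\s*body[^>]*>(.*)<\s*/\s*body\s*>.*$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns the body content, or the input unchanged
    // when no <body>...</body> pair is found (Replace leaves non-matches alone).
    public static string Extract(string html)
    {
        return retainBody.Replace(html, "$1");
    }

    public static void Main()
    {
        string result = "<html><head></head><body>The body</body></html>";
        Console.WriteLine(Extract(result)); // The body
    }
}
```

Note that when the input has no body element, this version returns the input untouched, whereas the Match approach gives you an empty group, so pick whichever failure behaviour suits you.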
 
