Regex Help!

 
#1, posted by c (Guest), 02-02-2006, 03:32 PM


Hello,

I have a question about a Regex I'm trying to write. I'm trying to
strip out the DOCTYPE declaration from an XML document I'm receiving.
I've noticed that the files are delivered with different DOCTYPEs,
though. Some look like this:

<!DOCTYPE nitf SYSTEM "nitf.dtd">

which I can strip out with the Regex <!DOCTYPE [^>]*>

However, some of the files are delivered like this:

<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [
<!ENTITY % xhtml SYSTEM "http://dtd.dtd">
%xhtml;
]>

which forced me to write another Regex <!ENTITY [^>]*> to strip out the
ENTITY tags. I've also noticed that there can be several more
declarations in a DOCTYPE such as ELEMENT, ATTLIST and NOTATION.
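
To make the problem concrete, here is a small sketch of what the first pattern leaves behind on the second kind of file. It is written in Python rather than VB .NET, purely for illustration, and the sample document and the names SIMPLE and doc are made up for this example. Because [^>]* stops at the first '>', the match ends at the close of the ENTITY declaration, not at the closing ']>' of the DOCTYPE.

```python
import re

# First pattern from the post: matches up to the first '>' it encounters.
SIMPLE = re.compile(r'<!DOCTYPE [^>]*>')

# Hypothetical sample document with an internal DTD subset.
doc = (
    '<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [\n'
    '<!ENTITY % xhtml SYSTEM "http://dtd.dtd">\n'
    '%xhtml;\n'
    ']>\n'
    '<nitf/>'
)

# The tail of the internal subset ('%xhtml;' and ']>') survives the substitution.
leftover = SIMPLE.sub('', doc)
print(repr(leftover))
```

Stripping ENTITY, ELEMENT, ATTLIST and NOTATION one by one would still leave the parameter-entity reference and the closing bracket behind, which is why a single pattern covering the whole bracketed subset is the more attractive goal.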

Does anyone know of a way to write one Regex that will strip out
the entire DTD, regardless of whether it contains any sub-declarations?

Thanks in advance.

#2, posted by c (Guest), 04-02-2006, 08:21 PM

I found the answer:

<!DOCTYPE[^>]*?(\[(.|\n)*?\]\s*?)*?>

Unfortunately, this doesn't work with vbscript.dll version
5.1.0.7426.
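
For anyone who wants to try the pattern quickly, here is a minimal sketch in Python (again, not VB .NET; the helper name strip_doctype and the sample strings are hypothetical). It also includes a greedy-only variant, GREEDY, which is not from the thread. It is offered on the unverified assumption that the lazy quantifiers (*?) are what the older VBScript engine rejects, and it assumes the internal subset itself contains no ']' character.

```python
import re

# Pattern from the post above, unchanged. (.|\n) lets the scan cross line breaks.
LAZY = re.compile(r'<!DOCTYPE[^>]*?(\[(.|\n)*?\]\s*?)*?>')

# Hypothetical greedy-only variant (an assumption, not from the thread):
# read up to an optional '[', then everything up to the next ']', then '>'.
GREEDY = re.compile(r'<!DOCTYPE[^>\[]*(\[[^\]]*\]\s*)?>')


def strip_doctype(xml, pattern=LAZY):
    """Remove the first DOCTYPE declaration matched by the given pattern."""
    return pattern.sub('', xml, count=1)


# Hypothetical sample inputs modelled on the two DOCTYPE forms in the question.
simple_doc = '<!DOCTYPE nitf SYSTEM "nitf.dtd">\n<nitf/>'
subset_doc = (
    '<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [\n'
    '<!ENTITY % xhtml SYSTEM "http://dtd.dtd">\n'
    '%xhtml;\n'
    ']>\n'
    '<nitf/>'
)

for sample in (simple_doc, subset_doc):
    print(repr(strip_doctype(sample)), repr(strip_doctype(sample, GREEDY)))
```

Neither pattern is a real DTD parser: a ']' or '>' inside a quoted entity value or a comment in the subset can end the match early, so this is only suitable for input whose DOCTYPE shape is known.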
