PC Review
Forums
Newsgroups
Microsoft DotNet
Microsoft VB .NET
Regex Help!
#1
Guest
Hello,
I have a question about a Regex I'm trying to write. I'm trying to strip the DOCTYPE declaration out of an XML document I'm receiving. I've noticed that the files are delivered with different DOCTYPEs, though. Some look like this:

<!DOCTYPE nitf SYSTEM "nitf.dtd">

which I can strip out with the Regex

<!DOCTYPE [^>]*>

However, some of the files are delivered like this:

<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [ <!ENTITY % xhtml SYSTEM "http://dtd.dtd"> %xhtml; ]>

which forced me to write another Regex

<!ENTITY [^>]*>

to strip out the ENTITY tags. I've also noticed that a DOCTYPE can contain several other kinds of declarations, such as ELEMENT, ATTLIST and NOTATION. Does anyone know how I can write one Regex that strips out the entire DTD, regardless of whether it contains any sub-declarations?

Thanks in advance.
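
To make the problem concrete, here is a small sketch of why the first pattern falls short on a DOCTYPE with an internal subset. It is written in Python purely for illustration (the thread itself is about .NET); the sample strings are the ones from the post with a hypothetical `<nitf/>` root element appended. The character class `[^>]*` stops at the first `>`, which is the one that closes the ENTITY declaration, so the tail of the DOCTYPE is left behind.

```python
import re

# First pattern from the post: everything up to the first ">".
simple = re.compile(r"<!DOCTYPE [^>]*>")

# Sample inputs from the post; the <nitf/> root element is a made-up stand-in.
plain = '<!DOCTYPE nitf SYSTEM "nitf.dtd"><nitf/>'
subset = (
    '<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [ '
    '<!ENTITY % xhtml SYSTEM "http://dtd.dtd"> %xhtml; ]><nitf/>'
)

plain_out = simple.sub("", plain)
subset_out = simple.sub("", subset)

print(repr(plain_out))   # the simple DOCTYPE is removed completely
print(repr(subset_out))  # the match ended inside the internal subset
```

In the second case the leftover text begins with ` %xhtml; ]>`, which is why a separate ENTITY pattern does not really solve the problem: the closing `]>` and any parameter-entity references still remain.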
#2
Guest
I found the answer:

<!DOCTYPE[^>]*?(\[(.|\n)*?\]\s*?)*?>

Unfortunately, this doesn't work with vbscript.dll version 5.1.0.7426.
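
A minimal sketch of the posted pattern in use, again in Python for illustration only. `POSTED` is the pattern from the post, unchanged. `NO_LAZY` is a hypothetical variant that is not from the thread: it avoids the lazy `*?` quantifiers, on the assumption (the post does not say why the older engine fails) that those are what the older regex engine rejects. The helper name `strip_doctype` and the `<nitf/>` root element are also made up for the example.

```python
import re

# Pattern from the post, used unchanged.
POSTED = re.compile(r"<!DOCTYPE[^>]*?(\[(.|\n)*?\]\s*?)*?>")

# Hypothetical variant without lazy quantifiers (not from the thread).
# Assumes the internal subset contains no literal "]" before its end.
NO_LAZY = re.compile(r"<!DOCTYPE[^>\[]*(\[[^\]]*\])?\s*>")

def strip_doctype(xml_text, pattern=POSTED):
    """Remove the first DOCTYPE declaration, internal subset included."""
    return pattern.sub("", xml_text, count=1)

plain = '<!DOCTYPE nitf SYSTEM "nitf.dtd"><nitf/>'
subset = (
    '<!DOCTYPE nitf SYSTEM "http://dtd.dtd" [\n'
    '<!ENTITY % xhtml SYSTEM "http://dtd.dtd">\n'
    '%xhtml;\n'
    ']><nitf/>'
)

for doc in (plain, subset):
    print(strip_doctype(doc))           # posted pattern
    print(strip_doctype(doc, NO_LAZY))  # variant without lazy quantifiers
```

Both patterns end the match at the first `]` that is followed by optional whitespace and `>`, so a subset that contains such a sequence earlier (inside a comment, for instance) would be cut short. For input like that, a real XML parser is likely the safer tool.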