Bizarre wildcard replace

Guest · Jan 26, 2006

Hi, I'm trying to write a basic Word -> HTML conversion (since I can't find
any tool that actually does a clean job of it). For example, I want to find
all instances of italics and replace them with <i>(original text)</i>.
I've tried doing this in the search-and-replace:

Turned on "Use Wildcards", selected Format in the 'find' with Font of
"Italic", do a search for '*'.

In the replace with, I've set formatting to "Not Italic," and made the
replace string <i>^&</i>.

I've also tried it with (*) in the find field and <i>\1</i> in the replace
field.

My problem is that this seems to only be matching one character at a time --
I end up with <i> and </i> around every individual character, instead of
around an entire word, part of a word, sentence or paragraph.

Any help would be much appreciated.

Jezebel · Jan 26, 2006

Word's implementation of regular expressions (which is what you get with
'Use Wildcards' checked) uses minimal matching, as opposed to Unix which
uses maximal. In other words it looks for the *smallest* sequence of
characters that match the Find expression -- in your case, one character at
a time. Which is a damned nuisance.
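
To make the minimal-versus-maximal distinction concrete, here is a small sketch using Python's `re` module as a stand-in. This is an analogy only: Python's syntax is not Word's wildcard syntax, and the `<i>` tag is assumed here as the target markup. The lazy pattern reproduces the one-character-at-a-time symptom described in the question.

```python
import re

text = "abc"

# Maximal (greedy) matching: one match that covers as much as possible.
greedy_matches = re.findall(r".+", text)

# Minimal (lazy) matching: the shortest possible match, found repeatedly.
lazy_matches = re.findall(r".+?", text)

# Wrapping each match in tags shows the practical difference.
# \g<0> is Python's spelling of "the whole match", like ^& in Word.
greedy_wrapped = re.sub(r".+", r"<i>\g<0></i>", text)
lazy_wrapped = re.sub(r".+?", r"<i>\g<0></i>", text)

print(greedy_matches)  # ['abc']
print(lazy_matches)    # ['a', 'b', 'c']
print(greedy_wrapped)  # <i>abc</i>
print(lazy_wrapped)    # <i>a</i><i>b</i><i>c</i>
```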

However, there's an easy fix, at least for the example you give: if you
don't check the 'Use Wildcards' checkbox you can leave the Find box blank:
it will then match the full sequence with the formatting you specify. In the
Replace box use ^&, which stands for the 'Find what' text. So:

Find: (blank), Format = italic
Replace: <i>^&</i>, Format = not italic
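
The effect of a blank Find box with a format set can be pictured as grouping adjacent characters that share the formatting and treating each group as one match. The sketch below is a rough model of that idea, not of Word's internals: the `(character, is_italic)` pairs, the function name `wrap_italic_runs`, and the `<i>` tags are all assumptions made for illustration.

```python
from itertools import groupby


def wrap_italic_runs(chars, open_tag="<i>", close_tag="</i>"):
    """Wrap each contiguous run of italic characters in a single tag pair.

    `chars` is a hypothetical document model: a list of
    (character, is_italic) pairs in reading order.
    """
    pieces = []
    # groupby collects neighbouring items that share the same italic flag,
    # so each run is handled once instead of character by character.
    for is_italic, group in groupby(chars, key=lambda pair: pair[1]):
        run = "".join(ch for ch, _ in group)
        pieces.append(f"{open_tag}{run}{close_tag}" if is_italic else run)
    return "".join(pieces)


# "a very big" with only the word "very" in italics.
sample = (
    [(c, False) for c in "a "]
    + [(c, True) for c in "very"]
    + [(c, False) for c in " big"]
)

print(wrap_italic_runs(sample))  # a <i>very</i> big
```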

John McGhie [MVP - Word and Word Macintosh] · Jan 29, 2006

Had you thought of trying a tool named "Microsoft Word 2003"?

It does a *perfect* job of converting a Word document to HTML.

That's a lie.

It does a perfect job of converting a Word document into
XHTML. HTML is not capable of describing a Word document. Word writes XML
inline with the HTML to describe the components of the document that HTML
cannot.

If you are working with Word 2003 Enterprise Edition, you will have a tool
named InfoPath available. InfoPath enables you to write an XML Transform
that would remove the various components you do not want from the XML that
Word writes.

Otherwise, you will find any number of tools available that will do a
greater or lesser version of what you are trying to do. FrontPage does an
excellent job if you simply "paste" the text of the Word document into the
FrontPage editor. Dreamweaver does a great job of filtering Word's HTML on
import. I think making your own is very much a case of re-inventing a wheel
that is already trundling down the highway under dozens of cars.

Cheers

--

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.

John McGhie
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 (0) 4 1209 1410
