Weird characters (double question marks) in text

Guest

My client has a handful of documents that contain either double question mark
(??) or y-umlaut (ÿ) characters. In most cases, the characters can be deleted,
but when you save the document the characters return.

We are working in Word 2003, but the documents could have been originally
created in any version of Word or another word processor.

Usually we can try various combinations of cutting, pasting, formatting, and
so forth to get rid of the characters, but we need a more automated solution.
We haven't found any way to find and replace these characters, and so far my
research hasn't turned up anything predictable enough to use an XSLT
transform.

I've tried to look for patterns and possible solutions by saving the
documents as XML and then saving back in Word format. Sometimes (but not
always) the XML document will not show the characters when opened in Word.
But when the XML document is then saved in Word format the characters will
return.

In all the cases I've looked at, the paragraph containing the characters
also contains an em space. Occasionally (but not always) replacing the
em space with two regular spaces or two en spaces will fix the problem. But as
soon as the em space is inserted again the problem returns.
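
For anyone who wants to script the em space swap described above, here is a minimal Python sketch. It assumes the paragraph text has already been extracted as plain strings (for example from the XML version of the document). The function names are made up for illustration, and as noted above the swap only helps some of the time.

```python
# Unicode code points for the two space characters discussed above.
EM_SPACE = "\u2003"
EN_SPACE = "\u2002"


def replace_em_spaces(text, replacement=EN_SPACE * 2):
    """Replace every em space with two en spaces (or another replacement)."""
    return text.replace(EM_SPACE, replacement)


def paragraphs_with_em_space(paragraphs):
    """Return the indexes of paragraphs that contain at least one em space."""
    return [i for i, p in enumerate(paragraphs) if EM_SPACE in p]
```

Passing `"  "` (two regular spaces) as `replacement` gives the other variant of the workaround.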

In most cases I've looked at where saving as XML has no effect or where the
characters return upon saving in Word format, I can see a <w:r> node that has
only one child, <w:rPr>. That node always contains a <w:b-cs/> child, along
with various combinations of <w:b/>, <w:i-cs/>, and <w:i/>.
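
To make that pattern easier to hunt for across many files, here is a rough Python sketch that finds (and optionally removes) runs matching the description above: a <w:r> whose only child is a <w:rPr> containing <w:b-cs/>. It assumes the document was saved as Word 2003 XML, and the helper names are hypothetical. Whether removing these runs actually cures the stray characters is untested, so treat it as a diagnostic aid rather than a fix.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the w: prefix in Word 2003 "Save as XML" output.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def w(tag):
    """Return the fully qualified ElementTree name for a w: element."""
    return "{%s}%s" % (W_NS, tag)


def is_suspect_run(run):
    """True if the run's only child is a w:rPr that contains w:b-cs."""
    children = list(run)
    if len(children) != 1 or children[0].tag != w("rPr"):
        return False
    return children[0].find(w("b-cs")) is not None


def remove_suspect_runs(root):
    """Delete suspect runs in place and return how many were removed."""
    removed = 0
    # ElementTree has no parent pointers, so treat every element as a
    # potential parent and inspect its direct children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == w("r") and is_suspect_run(child):
                parent.remove(child)
                removed += 1
    return removed
```

A typical use would be `root = ET.parse(path).getroot()` followed by `remove_suspect_runs(root)`, working on a copy of the file so the original stays available for comparison.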

There are other cases within these same documents where I can see these same
nodes as well as emspaces and there are no unexpected characters at all.

Has anyone else seen this? Is it a known Word bug? Any suggestions on how I
should proceed?
 
Guest

Hi Jan,

The biggest clue here is the fact that you are copying and pasting in your
document and you don't know which program generated the documents that you're
copying from. Whenever you're in doubt, you should always copy and then
select Paste Special and select Unformatted Text. As long as they aren't
huge documents, I would simply select the entire document and copy it and
paste it into WordPad to strip out any formatting and then copy it from there
and paste it into a new Word document. You can then format the document
properly in Word. I hope this has been helpful to you.
 
Guest

Cutting the text in question and then pasting as unformatted text only solves
the problem in 1 of the 4 test documents I'm looking at. In the others, the
text looks great right after the paste, but when I save the document the
characters either come back, or I get a proliferation of more double question
mark characters within the range, in different places from where they were
originally.

Also, I did a bit more research with the end users who have encountered this
problem, and it seems that it doesn't appear when they paste in information
from other sources. Rather, the text will look and print fine within the
document for some time and then suddenly one day the ?? characters will show
up within text that hasn't been changed in any other way.

Just to head off the obvious question: I did try doing an Open and Repair on
these documents. It had no effect.
 
