String Filter for non-Standard Characters

Guest

Using Visual Studio 2005.

I'm using XmlReader to read data into a string variable in C#. Works great!

Every once in a while, I see that the string data contains a symbol that
looks like a square.

Any suggestions on how to filter out characters that don't appear on a
keyboard? I hate to let the end-user see these square characters.
 
Paul E Collins

randy1200 said:
I'm using XmlReader to read data into a string variable
in C#. Works great! Every once in a while, I see that the
string data contains a symbol that looks like a square.

Have you checked what these characters actually are? You can get the
Unicode value by doing (int) s[i], where s is the string and i is the
zero-indexed position of the offending character. For example, they
might be Unix or Apple Mac line endings, if your file is coming
from another platform, or they might be legitimate foreign characters
that just aren't displayed correctly in your particular font.
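
To make the (int) s[i] check concrete, here is a minimal sketch that lists the code of every character in a string. The class and method names (CharInspector, Describe) are made up for illustration and are not part of the original posts.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class CharInspector
{
    // Returns one line per character: zero-based index, hex code, decimal code.
    public static string Describe(string s)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.Length; i++)
        {
            // (int) s[i] converts the char at position i to its numeric value.
            int code = (int)s[i];
            sb.AppendFormat("{0}: U+{1:X4} ({2})", i, code, code);
            sb.Append('\n');
        }
        return sb.ToString();
    }
}
```

Calling Console.Write(CharInspector.Describe(text)) on the string read from XmlReader should show which position holds the character that renders as a square.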
Any suggestions on how to filter out characters that don't
appear on a keyboard?

Again, it depends on your keyboard. You could just create a string
containing all the characters you're happy with and use a
StringBuilder to loop over the original string, copying acceptable
characters (acceptableChars.IndexOf(ch) >= 0) to a new target string.
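
The allow-list approach described above could look roughly like this. It is only a sketch: the names CharFilter and KeepOnly are hypothetical, and the set of acceptable characters is whatever you decide to pass in.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class CharFilter
{
    // Copies to the result only those characters that occur in acceptableChars.
    public static string KeepOnly(string input, string acceptableChars)
    {
        StringBuilder result = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            // IndexOf returns -1 when ch is not in the allow list.
            if (acceptableChars.IndexOf(ch) >= 0)
            {
                result.Append(ch);
            }
        }
        return result.ToString();
    }
}
```

For example, CharFilter.KeepOnly(text, "abcdefghijklmnopqrstuvwxyz ") would drop everything except lowercase ASCII letters and spaces. Note that this silently deletes characters, including legitimate foreign ones, so it is worth checking what the unwanted characters are first.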

But if all or most of the problem characters are the same one, just
use String.Replace.

Eq.
 
Guest

Many thanks for the excellent response. You gave me the tools to figure it out.

The Unicode value of the square was a 10, which is a line feed. If I just
printed my string variable to the console there was no square - I just got a
newline.

The string ultimately gets picked up by Microsoft Word. That's where the
square appeared.

While processing the string, I simply replace the line feed (LF) with a carriage return (CR):

string s = "bunch of characters including a line feed";

string lf = (Microsoft.VisualBasic.ControlChars.Lf).ToString();
string cr = (Microsoft.VisualBasic.ControlChars.Cr).ToString();

s = s.Replace(lf, cr);

Thanks again,
--
Randy


 
Paul E Collins

Good to hear you got it working.
string lf = (Microsoft.VisualBasic.ControlChars.Lf).ToString();
string cr = (Microsoft.VisualBasic.ControlChars.Cr).ToString();

Just for reference, a more concise and C#-ish way of representing
these characters is '\n' and '\r'.
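
As a sketch of that suggestion, the same replacement can be written with character literals and no reference to Microsoft.VisualBasic. The names LineFeedFixer and LfToCr are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LineFeedFixer
{
    // Replaces every line feed ('\n', code 10) with a carriage return ('\r', code 13).
    public static string LfToCr(string s)
    {
        return s.Replace('\n', '\r');
    }
}
```

One caveat: if the input already contains "\r\n" pairs, this produces "\r\r", so it assumes the text uses bare line feeds as in the case described above.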

Eq.
 
Chris Saunders

How about the other way around? I wrote an application where I wanted to
make use of that square character but didn't know where it comes from. I
think I would know how to use it if I knew its source.

Regards
Chris Saunders
 
