convert string to int

Peter K

Hi

I am calling a third-party data-access class which returns a data object to
me which contains a list of data. The items in the list are all of "object"
type, yet I can see they are all really "string" objects, and some of them
are actually string representations of int or decimal values.

For example, I get "56.001" or "787,902" returned in the list.

My problem is converting these int or decimal strings to real int or decimal
values.

How can my application know that "56.001" is a representation of "56 and
1/1000", or a representation of "fifty six thousand and one", for instance?
I take it I need to know how the 3rd-party library is converting or
generating these values?

Well, let's say I do know that "56.001" is an integer, and represents the
value "fifty six thousand and one" - how do I convert it to an int value?

Thanks,
Peter
 
Peter K

Well, I found the following method. My application is assuming that the DAL
is supplying me with data converted using the Danish culture, as this program
is running in an environment for a Danish company. But is this the best I can
do - make these sorts of assumptions?

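// Requires: using System.Globalization;
string s = "56.001";

CultureInfo ci = CultureInfo.GetCultureInfo("da-DK");
NumberFormatInfo ni = ci.NumberFormat;
// '.' is the Danish thousands separator, so this yields 56001.
int i = int.Parse(s, NumberStyles.AllowThousands, ci);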

Thanks,
Peter
 
Joe Cool

"Peter K" <[email protected]> wrote in a message:

Well, I found the following method. My application is assuming that the DAL
is supplying me with data converted using the Danish culture, as this program
is running in an environment for a Danish company. But is this the best I can
do - make these sorts of assumptions?

            string s = "56.001";

            CultureInfo ci = CultureInfo.GetCultureInfo("da-DK");
            NumberFormatInfo ni = ci.NumberFormat;
            int i = int.Parse(s, NumberStyles.AllowThousands, ci);

Why not just use the Convert.ToInt32() method? Or am I missing
something?
 
Abubakar

Joe Cool said:
Why not just use the Convert.ToInt32() method? Or am I missing
something?

He seems to be converting something which does not look like an integer if
you consider the en-US culture. But if you use the Danish culture and then do
an int.Parse, it works fine. So I think that's the way he should do this.
I have no experience with this (System.Globalization), but I just tried his
code and it works fine; if I change the culture, though, it throws an exception.
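
To make the difference from Convert.ToInt32() concrete, here is a minimal sketch. It assumes the Danish culture data (da-DK) is available on the machine; the variable names are only illustrative. Convert.ToInt32(string, IFormatProvider) parses with NumberStyles.Integer, which does not permit group separators, so it rejects "56.001" even when given the Danish culture.

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
CultureInfo danish = CultureInfo.GetCultureInfo("da-DK");
string s = "56.001";

// Convert.ToInt32 uses NumberStyles.Integer: no thousands separators allowed,
// so the '.' makes the input invalid and a FormatException is thrown.
bool convertFailed = false;
try
{
    Convert.ToInt32(s, danish);
}
catch (FormatException)
{
    convertFailed = true;
}

// int.Parse with AllowThousands accepts the Danish group separator '.'.
int parsed = int.Parse(s, NumberStyles.AllowThousands, danish);

Console.WriteLine($"Convert.ToInt32 failed: {convertFailed}");
Console.WriteLine($"int.Parse result: {parsed}");
```

So Convert.ToInt32() is only an option when the strings contain no separators at all.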

...ab
 
Peter K

Abubakar said:
He seems to be converting something which does not look like an integer if
you consider the en-US culture. But if you use the Danish culture and then do
an int.Parse, it works fine. So I think that's the way he should do this.
I have no experience with this (System.Globalization), but I just tried his
code and it works fine; if I change the culture, though, it throws an exception.

Yes - it's really irritating that the "library" I am using returns numbers
as objects/strings which are formatted a particular way, instead of
returning me an integer value 56001, or decimal value 787,901 for example.

I guess my application will just have to assume the Danish culture, which
uses a '.' as thousands separator and ',' as a decimal point (exactly the
opposite of most, if not all, English-speaking countries).
 
Arne Vajhøj

Peter said:
Well, I found the following method. My application is assuming that the DAL
is supplying me with data converted using the Danish culture, as this program
is running in an environment for a Danish company. But is this the best I can
do - make these sorts of assumptions?

string s = "56.001";

CultureInfo ci = CultureInfo.GetCultureInfo("da-DK");
NumberFormatInfo ni = ci.NumberFormat;
int i = int.Parse(s, NumberStyles.AllowThousands, ci);

You are not using ni.

But you basically can:
- specify CultureInfo in the Parse call
- set default CultureInfo before calling Parse

I prefer the explicit specification.
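
A rough sketch of the two options, assuming da-DK culture data is installed and a runtime where CultureInfo.CurrentCulture is settable (.NET Framework 4.6+ or .NET Core; on older versions Thread.CurrentThread.CurrentCulture plays the same role):

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
CultureInfo danish = CultureInfo.GetCultureInfo("da-DK");

// Option 1: pass the culture explicitly to Parse.
int explicitResult = int.Parse("56.001", NumberStyles.AllowThousands, danish);

// Option 2: change the current culture, then call Parse without a provider.
// This affects all culture-sensitive code on the thread until it is restored.
CultureInfo previous = CultureInfo.CurrentCulture;
int implicitResult;
try
{
    CultureInfo.CurrentCulture = danish;
    implicitResult = int.Parse("56.001", NumberStyles.AllowThousands);
}
finally
{
    CultureInfo.CurrentCulture = previous; // always restore the old culture
}

Console.WriteLine(explicitResult == implicitResult);
```

The explicit form keeps the assumption visible at the call site and has no side effects on other code.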

Arne
 
Abubakar

I tested the code that you gave: a comma within the number throws an
exception; only the dot goes through the conversion. So I don't know what
the comma means, but at least the dot is being evaluated as nothing.
 
Peter Duniho

Peter K said:
Yes - it's really irritating that the "library" I am using returns numbers
as objects/strings which are formatted a particular way, instead of
returning me an integer value 56001, or decimal value 787,901 for example.

I guess my application will just have to assume the Danish culture, which
uses a '.' as thousands separator and ',' as a decimal point (exactly the
opposite of most, if not all, English-speaking countries).

AFAIK, "Danish culture" necessarily is exclusive of "English-speaking
countries". Why _should_ it be expected to comply with the cultural norms
of English-speaking countries?

Beyond that, the use of a period as a thousands separator is certainly not
uncommon.

IMHO, the only real issue here is that, as you've pointed out, you've got
a programmatic API that is returning text formatted for humans to read.
It's bad enough that it's providing a string instead of a binary
representation of an int, but for it to include any thousands separators
at all seems unreasonable.

As far as parsing it goes, hopefully the library is 100% consistent and
always returns the string formatted in a specific culture. If so, the
approach you've found should work fine. If not, you'll need a way to deal
with it. If you can set the culture for the library, maybe you can just
tell it to use the "invariant" culture (which uses, by arbitrary decision,
the US conventions) and then parse using the invariant culture. If not,
then one hopes you can at least query for the culture in use.

Barring that, yet another approach would be to remove all non-digit
characters from the string before parsing. Assuming the string is always
formatted correctly and is always a plain integer (i.e. no decimal point,
only thousands separators), this will work no matter what culture is in
use.
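
A minimal sketch of that last approach. ParseDigitsOnly is a hypothetical helper name, and it only makes sense under the stated assumption that the value is a non-negative plain integer; it would silently drop a minus sign and would turn "56,5" into 565.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Top-level statements (C# 9 or later).
Console.WriteLine(ParseDigitsOnly("56.001"));
Console.WriteLine(ParseDigitsOnly("56,001"));
Console.WriteLine(ParseDigitsOnly("56 001"));

// Hypothetical helper: keep only ASCII digits, then parse what is left.
// Throws FormatException if the input contains no digits at all.
static int ParseDigitsOnly(string text)
{
    string digits = new string(text.Where(c => c >= '0' && c <= '9').ToArray());
    return int.Parse(digits, NumberStyles.None, CultureInfo.InvariantCulture);
}
```

All three calls yield 56001, regardless of which character was used as the thousands separator.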

Pete
 
Peter K

Arne Vajhøj said:
You are not using ni.

Yeah - I was experimenting with culture-info and related things, and found I
didn't seem to need the NumberFormatInfo, but forgot to remove it from the
snippet I posted.

Arne Vajhøj said:
But you basically can:
- specify CultureInfo in the Parse call
- set default CultureInfo before calling Parse

I prefer the explicit specification.

And that means I do need to know how the string has been formatted by the
3rd-party code. I guess it's not really possible to know from a string like
"56.001" what number that represents unless one has knowledge of the
formatting used to generate it - so I will document the fact that my app
assumes a certain culture/formatting (and/or make it configurable), and hope
the 3rd-party library doesn't begin to mix up the formats of the values it
delivers. I just wish they'd stuck to real values instead of strings.
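
One way the "configurable" idea might look, as a sketch only: configuredCultureName and TryReadInt are hypothetical names, and in a real application the culture name would presumably come from a settings file rather than a literal. TryParse is used so that a value in an unexpected format is reported instead of throwing.

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
// Hypothetical setting: the culture the 3rd-party library is assumed to use.
string configuredCultureName = "da-DK";
CultureInfo sourceCulture = CultureInfo.GetCultureInfo(configuredCultureName);

object[] items = { "56.001", "787,902", 42 };
foreach (object item in items)
{
    if (TryReadInt(item, sourceCulture, out int value))
        Console.WriteLine($"{item} -> {value}");
    else
        Console.WriteLine($"{item} is not an integer string in {sourceCulture.Name} format");
}

// Hypothetical helper: succeeds only for strings that are integers in the
// given culture (optional sign, group separators allowed, no decimal part).
static bool TryReadInt(object item, CultureInfo culture, out int value)
{
    value = 0;
    return item is string text
        && int.TryParse(text,
                        NumberStyles.AllowThousands | NumberStyles.AllowLeadingSign,
                        culture,
                        out value);
}
```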

Thanks,
Peter
 
Peter K

Peter Duniho said:
AFAIK, "Danish culture" necessarily is exclusive of "English-speaking
countries". Why _should_ it be expected to comply with the cultural norms
of English-speaking countries?

Indeed - I didn't mean to imply that it should.

Peter Duniho said:
Beyond that, the use of a period as a thousands separator is certainly not
uncommon.

No - I think much of Europe actually uses it this way (or a space), and a
comma as a decimal separator.

Peter Duniho said:
IMHO, the only real issue here is that, as you've pointed out, you've got
a programmatic API that is returning text formatted for humans to read.
It's bad enough that it's providing a string instead of a binary
representation of an int, but for it to include any thousands separators
at all seems unreasonable.

As far as parsing it goes, hopefully the library is 100% consistent and
always returns the string formatted in a specific culture. If so, the
approach you've found should work fine. If not, you'll need a way to deal
with it. If you can set the culture for the library, maybe you can just
tell it to use the "invariant" culture (which uses, by arbitrary decision,
the US conventions) and then parse using the invariant culture. If not,
then one hopes you can at least query for the culture in use.

I don't think I can set the library culture - but I haven't checked that. I
do know the data comes out of a database, but I don't know whether it is
already stored there as formatted strings. I'll have to assume that the
library is consistent - and tell the customer I'm assuming that (outlining
my reasoning).

Peter Duniho said:
Barring that, yet another approach would be to remove all non-digit
characters from the string before parsing. Assuming the string is always
formatted correctly and is always a plain integer (i.e. no decimal point,
only thousands separators), this will work no matter what culture is in
use.

There are also decimal numbers returned as strings, like "787,901". So I
can't simply strip extra characters. I do however know which data is which
type, so I could use something like what you suggest. I'm not sure if a
decimal could be returned like "56.001,901" though.

I'll just document the fact I'm using the Danish culture to convert the
strings the library provides me to real values.
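
Since the expected type of each item is known, a conversion along these lines should cope with both kinds of value, including a decimal with both separators such as "56.001,901". This is a sketch under the assumption that every item is a string in da-DK format; ConvertItem is a hypothetical helper, not part of the library.

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
CultureInfo danish = CultureInfo.GetCultureInfo("da-DK");

int count = (int)ConvertItem("56.001", typeof(int), danish);
decimal small = (decimal)ConvertItem("787,901", typeof(decimal), danish);
decimal large = (decimal)ConvertItem("56.001,901", typeof(decimal), danish);

Console.WriteLine(count == 56001);
Console.WriteLine(small == 787.901m);
Console.WriteLine(large == 56001.901m);

// Hypothetical helper: convert an item to the type the caller expects.
static object ConvertItem(object item, Type targetType, CultureInfo culture)
{
    string text = (string)item; // assumption: the library only hands out strings

    if (targetType == typeof(int))
        return int.Parse(text, NumberStyles.AllowThousands | NumberStyles.AllowLeadingSign, culture);

    if (targetType == typeof(decimal))
        // NumberStyles.Number allows group separators and a decimal separator.
        return decimal.Parse(text, NumberStyles.Number, culture);

    throw new NotSupportedException("No conversion defined for " + targetType);
}
```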
 
Arne Vajhøj

Abubakar said:
I tested the code that you gave: a comma within the number throws an
exception; only the dot goes through the conversion. So I don't know what
the comma means, but at least the dot is being evaluated as nothing.

1000.1 and 1,000.1 mean 1000 and 1/10 in English

1000,1 and 1.000,1 mean 1000 and 1/10 in Danish

Big difference in semantics of period and comma.
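
The four strings above can be checked with a short sketch, assuming en-US and da-DK culture data are both available:

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
CultureInfo english = CultureInfo.GetCultureInfo("en-US");
CultureInfo danish = CultureInfo.GetCultureInfo("da-DK");

// NumberStyles.Number allows both group separators and a decimal separator.
decimal e1 = decimal.Parse("1000.1", NumberStyles.Number, english);
decimal e2 = decimal.Parse("1,000.1", NumberStyles.Number, english);
decimal d1 = decimal.Parse("1000,1", NumberStyles.Number, danish);
decimal d2 = decimal.Parse("1.000,1", NumberStyles.Number, danish);

// All four are the same value once parsed with the matching culture.
Console.WriteLine(e1 == 1000.1m && e2 == 1000.1m && d1 == 1000.1m && d2 == 1000.1m);
```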

Arne
 
Arne Vajhøj

Peter said:
And that means I do need to know how the string has been formatted by the
3rd-party code. I guess it's not really possible to know from a string like
"56.001" what number that represents unless one has knowledge of the
formatting used to generate it - so I will document the fact that my app
assumes a certain culture/formatting (and/or make it configurable), and hope
the 3rd-party library doesn't begin to mix up the formats of the values it
delivers. I just wish they'd stuck to real values instead of strings.

1.234 and 1,234 are valid in both English and Danish, but with
different meaning.

So it is impossible to autodetect unless you know how many decimal places
to expect (a monetary amount would usually be easy to autodetect).

So documenting the format is mandatory.
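
A small sketch of that ambiguity, assuming en-US and da-DK culture data are available (ParseAs is just an illustrative helper): the same text parses successfully under both cultures, but to values that differ by a factor of 1000.

```csharp
using System;
using System.Globalization;

// Top-level statements (C# 9 or later).
string[] inputs = { "1.234", "1,234" };
foreach (string input in inputs)
{
    decimal asEnglish = ParseAs(input, "en-US");
    decimal asDanish = ParseAs(input, "da-DK");

    // Print with the invariant culture so the output itself is unambiguous.
    Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
        "{0}: en-US = {1}, da-DK = {2}", input, asEnglish, asDanish));
}

// Illustrative helper: parse text using the named culture's separators.
static decimal ParseAs(string text, string cultureName)
{
    CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
    return decimal.Parse(text, NumberStyles.Number, culture);
}
```

Neither parse fails, so nothing in the string itself signals which reading was intended.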

Arne
 
