phone number regex


Brian Henry

I have phone numbers like this in a data table

123-435-1234
1231231234
432.234.2321

They all have different formatting. What I want to do is get them all
formatted like this:

(123) 123-1234

Regex.Replace(Convert.ToString(drRow("Phone")), "(\d{3})(\d{3})(\d{4})",
"($1) $2-$3")

That works only when there are no special characters in the number.
How would I remove the special characters and then do the formatting? Thanks!
 

Joseph Bittman MCAD

June 22, 2005

Dim number As String = Convert.ToString(drRow("Phone")) ' your phone number source

' Strings are immutable, so assign the result of each Replace
number = number.Replace("-", "")
number = number.Replace(" ", "")
number = number.Replace(".", "")

Now all the numbers will be NNNNNNNNNN...

Now use Substring(startIndex, length) to pull out the pieces:

number = "(" & number.Substring(0, 3) & ") " & number.Substring(3, 3) & "-" _
    & number.Substring(6, 4)

This should produce (NNN) NNN-NNNN
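
For anyone who wants to try this approach end to end, here is a minimal sketch. The module name, the FormatPhone helper, and the length check are my own assumptions for illustration and are not part of the original suggestion; the sample inputs are the three numbers from the question.

```vbnet
Imports System

Module PhoneFormatSketch

    ' Hypothetical helper: strips the three separators mentioned above,
    ' then rebuilds the number with Substring.
    Function FormatPhone(ByVal raw As String) As String
        Dim digits As String = raw.Replace("-", "").Replace(" ", "").Replace(".", "")

        ' Assumption: anything that is not exactly ten characters
        ' after stripping is returned unchanged.
        If digits.Length <> 10 Then Return raw

        Return "(" & digits.Substring(0, 3) & ") " & _
               digits.Substring(3, 3) & "-" & _
               digits.Substring(6, 4)
    End Function

    Sub Main()
        Console.WriteLine(FormatPhone("123-435-1234")) ' (123) 435-1234
        Console.WriteLine(FormatPhone("1231231234"))   ' (123) 123-1234
        Console.WriteLine(FormatPhone("432.234.2321")) ' (432) 234-2321
    End Sub

End Module
```

The length check only guards Substring against short input; it does not verify that the ten remaining characters are digits.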

Hope this helps and have a great day!

--
Joseph Bittman
Microsoft Certified Application Developer

Web Site: http://71.35.110.42
Dynamic IP -- Check here for future changes
 

Mythran


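pattern = "^.*(\d{3}).*(\d{3}).*(\d{4}).*$"

This would allow anything before, between, and after the digit groups.
For example, with your replacement string "($1) $2-$3", this input:

abc123def456ghi7890

becomes:

(123) 456-7890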

HTH :)

Mythran
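
A minimal sketch of the pure-regex route, assuming the pattern above is paired with the "($1) $2-$3" replacement string from the original question. The module name and sample array are only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexOnlySketch

    Sub Main()
        ' Pattern from the post above; replacement string from the question.
        Dim pattern As String = "^.*(\d{3}).*(\d{3}).*(\d{4}).*$"
        Dim samples() As String = {"123-435-1234", "1231231234", "432.234.2321"}

        For Each s As String In samples
            ' The pattern is anchored, so the whole input is replaced
            ' by the formatted number.
            Console.WriteLine(Regex.Replace(s, pattern, "($1) $2-$3"))
        Next
        ' Prints:
        ' (123) 435-1234
        ' (123) 123-1234
        ' (432) 234-2321
    End Sub

End Module
```

Note that .* also matches digits, so input with more than ten digits still matches and some digits are silently dropped. If that matters, a stricter variant such as "^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*$" leaves any input that does not contain exactly ten digits unchanged.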
 

Mike Labosh

We do something like this: (Note that *we here* do not translate letters to
numbers, as in 1-800-CALL-NOW):

Imports System.Text.RegularExpressions

' This is one of my classes that does dozens
' of different kinds of string manipulation
Public Class Transform

    Private _cleanPhone As New Regex("\D")

    ' Other declarations...

    ' Strings are immutable, so Replace returns a new string.
    ' Return it to the caller rather than modifying the argument.
    Public Function CleanPhoneNumber(ByVal value As String) As String
        Return _cleanPhone.Replace(value, "")
    End Function

    ' Other Methods...

End Class

So in the client, if I do this:

Dim tx As New Transform()
Dim phone As String
' load my data table (dt)

For Each dr As DataRow In dt.Rows
    phone = dr("phone").ToString()
    phone = tx.CleanPhoneNumber(phone)

    ' now I have 1234567890
    ' and then I have a line like Joseph's code snippet:
    dr("phone") = _
        "(" & phone.Substring(0, 3) & ") " & _
        phone.Substring(3, 3) & "-" & _
        phone.Substring(6, 4)

    ' now I have (123) 456-7890

Next

da.Update(dt) ' give it to a SqlDataAdapter


Note that this format only works with NANP (3 digits)-(7 digits) phone
numbers, as are found in the US, Canada and a few other countries. Here's a
good reference for other international phone numbers: http://www.wtng.info

If you have to deal with international phone numbers, you will have to build
a table of all countries you're interested in, that contains a country code
or name, International Prefix, Trunk Prefix and maybe some other data from
the web site if it concerns you.

I cannot share details because of intellectual property issues, but I spent
4 months last year developing a phone number parsing system compatible with
auto-dialers that can reliably dial any phone number on the planet *from*
anywhere on the planet.

The above website was my guide to all the telephone mysteries. :)

While I'm here, I will point out that if you use Regex.Replace() directly
without creating an instance of Regex, and I bet you're doing this in a
loop... That's bad.

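The static methods of the Regex class incur the overhead of setting up and
tearing down a Regex on every call (the runtime caches recently used
patterns for the static methods, which softens the cost but does not remove
it). This means that if you do something like this: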

For Each dr As DataRow In dt.Rows
    Regex.Replace(...)
Next

What you're doing is the same as this:

For Each dr As DataRow In dt.Rows
    Dim rx As New Regex(...)
    rx.Replace(...)
    rx = Nothing
Next

So for something like an immutable Regex instance, you should declare an
instance of it and *then* run your loop:

Dim rx As New Regex(...)

For Each dr As DataRow In dt.Rows
    rx.Replace(...)
Next

The difference between the first loop and the last loop is that the first
one creates and destroys one instance of Regex per record in your
DataTable. The last loop, directly above, uses only a single instance of
Regex. How much that matters depends on how many records you're
processing. In our case, the difference was astronomical. When we clean
phone numbers, it could be close to a million at a time. Just imagine what
the previous loop would have done to the Garbage Collector! :)
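
To make the last loop concrete, here is a minimal runnable sketch. The in-memory DataTable, the module name, and the variable names are assumptions standing in for the real data source; the second pattern is the one from the original question, anchored so it only fires on exactly ten digits.

```vbnet
Imports System
Imports System.Data
Imports System.Text.RegularExpressions

Module ReuseRegexSketch

    Sub Main()
        ' Hypothetical in-memory table standing in for the real data source.
        Dim dt As New DataTable()
        dt.Columns.Add("phone", GetType(String))
        dt.Rows.Add("123-435-1234")
        dt.Rows.Add("1231231234")
        dt.Rows.Add("432.234.2321")

        ' Both Regex instances are created once, outside the loop.
        Dim nonDigits As New Regex("\D")
        Dim tenDigits As New Regex("^(\d{3})(\d{3})(\d{4})$")

        For Each dr As DataRow In dt.Rows
            ' Step 1: strip everything that is not a digit.
            Dim digits As String = nonDigits.Replace(dr("phone").ToString(), "")
            ' Step 2: reformat; input without exactly ten digits is stored as stripped.
            dr("phone") = tenDigits.Replace(digits, "($1) $2-$3")
            Console.WriteLine(dr("phone"))
        Next
        ' Prints:
        ' (123) 435-1234
        ' (123) 123-1234
        ' (432) 234-2321
    End Sub

End Module
```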


--
Peace & happy computing,

Mike Labosh, MCSD

"Mr. McKittrick, after very careful consideration, I have
come to the conclusion that this new system SUCKS."
-- General Barringer, "War Games"
 

Brian Henry

Thanks for the code, but I was looking for a pure regex solution like the
one Mythran provided.


 
