Converting Dates to 8 digits

J

jermsalerms

I have dates being imported (AS TEXT) to excel in various formats and
want them all to appear in their cells in the same 8 digit format
(MMDDYYYY) so it can be exported to another program

for example:

Single Digit Month and Day
cell A1: 3179 (convert to 03011979)
cell A2: 30179 (convert to 03011979)
cell A3: 030179 (convert to 03011979)
cell A4: 311979 (convert to 03011979)
cell A5: 3011979 (convert to 03011979)
cell A6: 03011979 (already correct)

Single Digit Month and Double Digit Day
cell A7: 52165 (convert to 05211965)
cell A8: 052165 (convert to 05211965)
cell A9: 5211965 (convert to 05211965)
cell A10: 05211965 (already correct)

Double Digit Month and Single Digit Day
cell A11: 12523 (convert to 12051923)
cell A12: 120523 (convert to 12051923)
cell A13: 1251923 (convert to 12051923)
cell A14: 12051923 (already correct)

Double Digit Month and Double Digit Day
cell A15: 112180 (convert to 11211980)
cell A16: 11211980 (already correct)
 
D

Dave Peterson

I don't think you'll ever find a way to do it perfectly.

If all you're doing is looking at numbers, then you can't tell what this should
mean:

12523

Dec 5th, 1923 or Jan 25, 1923.
 
J

jermsalerms

Didn't catch that scenario!


The problem is that I am uploading the spreadsheet as a .csv to a
program that fills in the Date of Birth field and it will only
recognize 8 digit format.

Also it is not something that can be done by hand because I get a list
of 4000+ once a month.

Any suggestions for accuracy on a majority of them
 
D

Dave Peterson

I think I'd add an extra column:

With the data in column A and my extra column B:

=len(a1)
and drag down.

Then I'd sort by this column.

Now I could use data|text to columns for each group of non-ambiguous entries.
(4 digits or 6 digits only???)

Since excel is gonna guess 2023 instead of 1923, you may want to convert the
year using another column:

=date(year(c1)-if(Year(c1)>2006,100,0),month(c1),day(c1))

But I think you're gonna get stuck fixing lots manually.
 
R

Ron Rosenfeld

Didn't catch that scenario!


The problem is that I am uploading the spreadsheet as a .csv to a
program that fills in the Date of Birth field and it will only
recognize 8 digit format.

Also it is not something that can be done by hand because I get a list
of 4000+ once a month.

Any suggestions for accuracy on a majority of them

I think you will need to change the data entry method to produce consistency.
--ron
 
J

jermsalerms

dates are filling in via these different formats in cells A1:A901

MDYY
MMDYY
MDDYY
MMDDYY
MDYYYY
MDDYYYY
MMDDYYYY

I need to convert them all to MMDDYYYY in cell B1:B901

so far no luck figuring this out
 
D

Dave Peterson

As soon as you can specify what "12523" is, then the rules could be applied that
way.

But, again, you won't always get the correct result. They'll look like dates,
but they may not be the correct dates.
 
R

Ron Rosenfeld

dates are filling in via these different formats in cells A1:A901

MDYY
MMDYY
MDDYY
MMDDYY
MDYYYY
MDDYYYY
MMDDYYYY

I need to convert them all to MMDDYYYY in cell B1:B901

so far no luck figuring this out

And you will NOT be able to figure it out because the data is inherently
ambiguous.

I repeat, YOU WILL HAVE TO STANDARDIZE THE DATA ENTRY METHOD.

There is no method of deducing the input format from the resultant number,
given the allowable formats.
--ron
 
P

Pete

There is another 7 digit combination which you have not listed, i.e.:

MMDYYYY

I have been trying to develop formulae for this, returning 8 digits
only when there is no uncertainty and otherwise returning "check".
There are too many IFs to fit into one formula, and I might have to
split it into 3 or 4 to catch all uncertainties. I have assumed, as
these are dates of birth, that all 2-digit years will be 19xx.

It's getting late here in the UK (2:10 am), so I'll have another look
at it tomorrow evening.

Pete
 
J

jermsalerms

unfortunately this data is coming from a third party provider and the
will not standardize their data...its always got to be hard.

I was thinking of making a few assumptions with the dates that could
have different combinations...such as 12573....(if 1st two numbers are
10,11, or 12...then assume that that is the month).

At least that way I can get maybe about 85% of them in correctly and
just confirm these birthdates with the client when we speak with them.


Yes these are all 19xx years.
 
P

Pete

OK, I've been waiting for you to get back to me. I'll post what I've
got so far later on (in about 4 hours - have to do some work now).

By the way, any 6 digit date is ambiguous if it is xx19xx, as this
could be MMDDYY or MDYYYY - do you have any suggestions/preferences for
this?

I have assumed no leading zeroes, so if the first digit is greater than
1 then the date format must be MDxx_whatever, and if second digit is 0
(with first being 1), this must be MMxx_whatever with MM=10.

Pete
 
P

Pete

I've assumed that your dates are in A1 (down) of sheet1, and you want
the (composite) response in B1. Insert a new sheet (Sheet2) and enter
these formulae:

A1:
=IF(LEFT(Sheet1!A1,1)="0",RIGHT(Sheet1!A1,LEN(Sheet1!A1)-1),Sheet1!A1)

This strips any leading zeroes from your dates if they are text.

C1:
=IF(LEN(A1)=4,"0"&LEFT(A1,1)&"0"&MID(A1,2,1)&"19"&RIGHT(A1,2),IF(LEN(A1)=8,A1,""))

This checks for 4-digit or 8-digit dates - these are unambiguous.

D1:
=IF(LEN(A1)<>5,"",IF(VALUE(MID(A1,2,1))>2,"0"&LEFT(A1,3)&"19"&RIGHT(A1,2),
IF(VALUE(LEFT(A1,1))>1,"0"&LEFT(A1,3)&"19"&RIGHT(A1,2),"check")))

Column D checks for 5-digit dates - I've broken the formula to avoid
awkward line-breaks.

E1:
=IF(LEN(A1)<>6,"",IF(MID(A1,3,2)<>"19",LEFT(A1,4)&"19"&RIGHT(A1,2),
IF(VALUE(LEFT(A1,1))>1,"0"&LEFT(A1,1)&"0"&RIGHT(A1,5),"check")))

Column E is for 6-digit dates.

F1: =IF(LEN(A1)<>7,"",IF(VALUE(MID(A1,2,1))>2,"0"&A1,
IF(LEFT(A1,2)="10",LEFT(A1,2)&"0"&RIGHT(A1,5),
IF(VALUE(LEFT(A1,1))>1,"0"&A1,"check"))))

Column F checks 7-digit dates. I have not built in any checks for <
4-digit or > 8-digit. Finally in this sheet:

B1: =C1&D1&E1&F1

This holds the appropriate output. I would suggest that you use
conditional formatting on this - if Cell Contents, Equal To, "check"
then change the background colour to something gawdy. All these
formulae can be copied down for as many dates as you have in sheet1.
One final formula in sheet1 cell B1:

=""&Sheet2!B1

I would suggest that you work on Sheet2, examining any values of
"check" in column B and entering your own interpretation of the date as
applicable - you should enter an apostrophe first in order to preserve
any leading zeroes (the output is text). Then when you have all the
dates in the correct format, you can fix the values in column B of
sheet1.

There may be a few more unambiguous combinations which others can
identify, and these can be built into the formulae quite easily.

Hope this helps to reduce your workload.

Pete
 
P

Pete

Incidentally, this is what you get with your sample data:

A B
3179 03011979
30179 03011979
52165 05211965
12523 check
030179 03011979
311979 03011979
052165 05211965
120523 12051923
112180 11211980
3011979 03011979
5211965 05211965
1251923 check
03011979 03011979
05211965 05211965
12051923 12051923
11211980 11211980

Pete
 
J

jermsalerms

tested against 900 records and only 20 resulted with "check"...98
accuracy.

I can't believe you were able to do this.

Hats off to you!!
 
P

Pete

Thanks for the feedback. I was wondering how you had got on with it,
and what kind of rejection rate you would get - in your earlier
postings you were suggesting that 85% conversion would be reasonable,
but I'm surprised (and happy for you) that it is so high. I presume you
have sampled the output to check for the accuracy of the conversion?

With 4000 or so dates per month to convert, it shouldn't take too long
to check 80 of them manually.

Pete
 
J

jermsalerms

Yes I am very impressed....I ran down the list for quit awhile to make
sure the dates were converting correctly and they seem to be working
correctly.
 
P

Pete

Good, I'm glad.

I thought of another certainty earlier - in a 5-digit date which has
zero as the middle digit, this must be MDDYY. I haven't coded that, so
that might reduce the uncertainties even more. Do you fancy having a go
? (It's getting late here in the UK, so time for bed).

Pete
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top