Pulling cells from multiple reports into a new column

P

poalman

Hi,
I've been search groups and have spent a day now trying to code this
myself but I'm not getting very far due to my lack of VBA knowledge.

I have a problem in that we have an excel spreadsheet that contains a
lot of information, a spreadsheet with around 50 columns and over
20000 rows. I've been asked to add an aditional column to these 20000
rows with the implementation date- (Each row has a unique item ID). I
have obtained the dates, and they are all contained in reports (theres
around 50 of them in total), which are also excel spreadsheets, these
spreadsheets contain the corresponding item ID, but there are multiple
lines with the same item ID (different dates for various mile stones),
where this happens i need the earliest date (this is always the
implementation date).

So what I'd ideally like to be able to achieve is run a macro that
could jump into each of the reports, find the rows with the
corresponding item ID, get the earliest date in the date column,
create a new column on the main spreadsheet and copy this date into
it, against the respective item ID.

Is this an achievable solution with VBA?

Many Thanks!!
Paul.
 
N

Nigel

The simple answer is Yes it can be coded in VBA.

But the approach will depend on if the source of the ID is known or if the
ID can be in more than one report?

Option 1
If you know that any one ID in your master worksheet is in only one report
then the simplest approach would be to scan all rows in the master, open the
relevant source report worksheet, scan that looking for each ID and keep a
track of the oldest date, transfer the oldest date to your master workbook
and close the source report. Repeat for each row in the master.

Option 2
If however the ID can be in many reports and or you do not know the source,
it might be better to summarise each source report keeping the oldest ID in
each, then combine all these into one composite table, and then process this
into a single ID with the oldest date. Use this new list as a lookup for
your master ID worksheet.

Knowing your exact circumstances will help define the solution.
 
P

poalman

The simple answer is Yes it can be coded in VBA.

But the approach will depend on if the source of the ID is known or if the
ID can be in more than one report?

Option 1
If you know that any one ID in your master worksheet is in only one report
then the simplest approach would be to scan all rows in the master, open the
relevant source report worksheet, scan that looking for each ID and keep a
track of the oldest date, transfer the oldest date to your master workbook
and close the source report.  Repeat for each row in the master.

Option 2
If however the ID can be in many reports and or you do not know the source,
it might be better to summarise each source report keeping the oldest ID in
each, then combine all these into one composite table, and then process this
into a single ID with the oldest date.  Use this new list as a lookup for
your master ID worksheet.

Knowing your exact circumstances will help define the solution.

--

Regards,
Nigel
(e-mail address removed)










- Show quoted text -

Hi Nigel,
Thank you for your speedy response!!
Each item ID only has fields in 1 of the reports, the reports are
called nx000000.xls, nx000500.xls, nx001000.xls ect so for instance
the first spreadsheet has information on items 000001 - 000499, the
next 000500 - 000999 ect.
I could combine them into 1 large spreadsheet but as there is around
16000 rows in each report, it wouldn't all fit into 1 spreadsheet so
there doesn't seem like any point in combining them at all (vba
solution that ran through 10 large spreadsheets would be the same as
one running through 50 smaller one).
The additional column needs to contain the earliest date, but I'm
guessing that it would almost be identical to a solution that picked
out the oldest date.

Thanks!!
Paul.
 
G

Geoff K

Producing code for static results like you are suggesting is always a balance
between time spent to write the code or simply "getting on with it manually".
But to be constructive one algorthm would be:

Create a new "Extract" wsheet in the master wbook.
Copy paste the id and date columns from each "Report" wboook onto it.
Sort each pair of columns in ascending order.
Create an array for results and for each id row on the Summary wsheet do a
Find on the Extract wsheet (Find will only return the first result).
Copy the array results into your date column on the Summary wsheet.

Come back if you need assistance to do this.

hth

Geoff K
 
N

Nigel

OK .....

1. The ID is only in one source workbook, and you can identify which sheet
to refer, but the ID is intelligent (not good practice) and not to be relied
upon.
2. The earliest/oldest date are will be the same - why do you suspect this
would not be identical?
3. I proposed to summarise each of the 50 or so books so they contained
only one entry for the ID and the oldest/earliest date. Then combining these
would result in a single workbook of ~20,000 rows. Which would process a lot
faster!

I recommend the last approach. The following code will sequentially open
each of the report workbooks. The method I would use to summarise each
report book is a combination of sort on ID by Date (ascending), extract the
first instance of the ID and related date, copy it to the next row in the
lookup workbook, advance to next row in the lookup table, repeat until end
of the report book. Process Next report book .

Finally save the lookup (ID_Lookup.xls), this will have all IDs and the
oldest date in it. You can now use this to map into your master workbook.

Note: I have assumed report books have ID in column A and Date in Column B,
change the code below to suit what you have, also I copy column A&B to the
lookup, yours will need to be changed to suit.


Sub CombineWbks()

Dim sFilePath As String, sFileName As String
Dim wbLookUp, wbFile As Workbook
Dim iFile As Integer, iFNum
Dim lLookUpRow As Long, lLastRow As Long, lRow As Long

sFilePath = "D:\" ' set this to path for files

Set wbLookUp = Workbooks.Add

lLookUpRow = 1
For iFile = 0 To 50

' buildfile name
iFNum = CStr(iFile * 500)
sFileName = sFilePath & "\" & "nx" & String(6 - Len(iFNum), "0") & iFNum
& "xls"

'test if file exists
If Dir(sFileName) <> "" Then

Set wbFile = Workbooks.Open(sFileName)

' process the file
With wbFile.Sheets("Sheet1")
' get last row
lLastRow = .Cells(.Rows.Count, "A").End(xlUp).Row

' sort the data (assumes 2 columns A= ID; B = Date)
.Range(.Cells(2, "A"), .Cells(lLastRow, "B")).Sort _
key1:=.Range("A2"), Order1:=xlAscending, _
key2:=.Range("B2"), Order2:=xlAscending

' scan file for oldest id/date and transfer to lookup
For lRow = 2 To lLastRow
' check lag by 1 record for change in ID
If .Cells(lRow - 1, "A") <> .Cells(lRow, "A") Then
.Range(.Cells(lRow, "A"), .Cells(lRow, "B")).Copy _
Destination:=wbLookUp.Sheets(1).Cells(lLookUpRow, 1)

' next row in lookup table
lLookUpRow = lLookUpRow + 1

End If
Next

End With

wbFile.Close savechanges:=False

End If

Next iFile

wbLookUp.SaveAs Filename:=sFilePath & "ID_LookUp.xls"


End Sub

--

Regards,
Nigel
(e-mail address removed)



The simple answer is Yes it can be coded in VBA.

But the approach will depend on if the source of the ID is known or if the
ID can be in more than one report?

Option 1
If you know that any one ID in your master worksheet is in only one report
then the simplest approach would be to scan all rows in the master, open
the
relevant source report worksheet, scan that looking for each ID and keep a
track of the oldest date, transfer the oldest date to your master workbook
and close the source report. Repeat for each row in the master.

Option 2
If however the ID can be in many reports and or you do not know the
source,
it might be better to summarise each source report keeping the oldest ID
in
each, then combine all these into one composite table, and then process
this
into a single ID with the oldest date. Use this new list as a lookup for
your master ID worksheet.

Knowing your exact circumstances will help define the solution.

--

Regards,
Nigel
(e-mail address removed)










- Show quoted text -

Hi Nigel,
Thank you for your speedy response!!
Each item ID only has fields in 1 of the reports, the reports are
called nx000000.xls, nx000500.xls, nx001000.xls ect so for instance
the first spreadsheet has information on items 000001 - 000499, the
next 000500 - 000999 ect.
I could combine them into 1 large spreadsheet but as there is around
16000 rows in each report, it wouldn't all fit into 1 spreadsheet so
there doesn't seem like any point in combining them at all (vba
solution that ran through 10 large spreadsheets would be the same as
one running through 50 smaller one).
The additional column needs to contain the earliest date, but I'm
guessing that it would almost be identical to a solution that picked
out the oldest date.

Thanks!!
Paul.
 
P

poalman

OK .....

1. The ID is only in one source workbook, and you can identify which sheet
to refer, but the ID is intelligent (not good practice) and not to be relied
upon.
2. The earliest/oldest date are will be the same - why do you suspect this
would not be identical?
3. I proposed to summarise each of the  50 or so books so they contained
only one entry for the ID and the oldest/earliest date. Then combining these
would result in a single workbook of ~20,000 rows. Which would process a lot
faster!

I recommend the last approach.  The following code will sequentially open
each of the report workbooks.  The method I would use to summarise each
report book is a combination of sort on ID by Date  (ascending), extractthe
first instance of the ID and related date, copy it to the next row in the
lookup workbook, advance to next row in the lookup table, repeat until end
of the report book. Process Next report book .

Finally save the lookup (ID_Lookup.xls), this will have all IDs and the
oldest date in it.  You can now use this to map into your master workbook.

Note: I have assumed report books have ID in column A and Date in Column B,
change the code below to suit what you have, also I copy column A&B to the
lookup, yours will need to be changed to suit.

Sub CombineWbks()

Dim sFilePath As String, sFileName As String
Dim wbLookUp, wbFile As Workbook
Dim iFile As Integer, iFNum
Dim lLookUpRow As Long, lLastRow As Long, lRow As Long

sFilePath = "D:\" ' set this to path for files

Set wbLookUp = Workbooks.Add

lLookUpRow = 1
For iFile = 0 To 50

   ' buildfile name
   iFNum = CStr(iFile * 500)
   sFileName = sFilePath & "\" & "nx" & String(6 - Len(iFNum), "0") & iFNum
& "xls"

   'test if file exists
   If Dir(sFileName) <> "" Then

      Set wbFile = Workbooks.Open(sFileName)

      ' process the file
      With wbFile.Sheets("Sheet1")
        ' get last row
        lLastRow = .Cells(.Rows.Count, "A").End(xlUp).Row

        ' sort the data (assumes 2 columns A= ID; B = Date)
        .Range(.Cells(2, "A"), .Cells(lLastRow, "B")).Sort _
        key1:=.Range("A2"), Order1:=xlAscending, _
        key2:=.Range("B2"), Order2:=xlAscending

        ' scan file for oldest id/date and transfer to lookup
        For lRow = 2 To lLastRow
           ' check lag by 1 record for change in ID
           If .Cells(lRow - 1, "A") <> .Cells(lRow, "A") Then
              .Range(.Cells(lRow, "A"), .Cells(lRow, "B")).Copy _
              Destination:=wbLookUp.Sheets(1).Cells(lLookUpRow, 1)

              ' next row in lookup table
              lLookUpRow = lLookUpRow + 1

           End If
        Next

      End With

      wbFile.Close savechanges:=False

   End If

Next iFile

wbLookUp.SaveAs Filename:=sFilePath & "ID_LookUp.xls"

End Sub

--

Regards,
Nigel
(e-mail address removed)









Hi Nigel,
Thank you for your speedy response!!
Each item ID only has fields in 1 of the reports, the reports are
called nx000000.xls, nx000500.xls, nx001000.xls ect so for instance
the first spreadsheet has information on items 000001 - 000499, the
next 000500 - 000999 ect.
I could combine them into 1 large spreadsheet but as there is around
16000 rows in each report, it wouldn't all fit into 1 spreadsheet so
there doesn't seem like any point in combining them at all (vba
solution that ran through 10 large spreadsheets would be the same as
one running through 50 smaller one).
The additional column needs to contain the earliest date, but I'm
guessing that it would almost be identical to a solution that picked
out the oldest date.

Thanks!!
Paul.- Hide quoted text -

- Show quoted text -

Thanks for your reply!
I will toy with this code as it looks very promising!!
A couple of potential problems that I've notice though, I have just
realised the date column that I need to extract is stored as a string
in the format dd/mm/yyyy hh:mm:ss:msmsms. as its a string if you sort
on date it just orders them as if it were a number, so 01/02/2005
orders itself before 02/02/2004. I'm guessing I can put in a column
with something like =RIGHT(LEFT(B2,10),4)&"/"&RIGHT(LEFT(B2,5),
2)&"/"&LEFT(B2,2) just so I can sort accurately on that unless there
is a better way?

Also the reports arn't always in groups of 500, some are 1000 and
others are 100. I'm not entirely sure the reason for this but it is
something to do with the system that we pull the extracts from.

Yes I was having a funny moment earlier :) ofcourse oldest and
earliest mean the exact same thing!!!

I presume this macro would go in a new blank workbook?

I'll start to tinker with it now!
Thanks again!!
Paul.
 
N

Nigel

Sorting the dates as text can be adjusted by adding to the sort function

DataOption2:=xlSortTextAsNumbers

As long as the filenames increment by '500' as you define it, the size of
the file and numbers of IDs in each does not matter. If the filenames are
not sequential, then you will need to change the file reading process to
scan the directory looking for any file beginning 'nx' - if that is
appropriate?

Use Dir to do this.... e.g.

sfile = Dir(sFilePath & "nx*.xls")
Do While sfile <> ""

' process file called sFile

sfile = Dir
Loop


As written use a blank workbook, or you could put it into the IDLookUp.xls,
in which case change the line

Set wbLookUp = Workbooks.Add

to

Set wbLookUp = ActiveWorkbook


HTH



--

Regards,
Nigel
(e-mail address removed)



OK .....

1. The ID is only in one source workbook, and you can identify which sheet
to refer, but the ID is intelligent (not good practice) and not to be
relied
upon.
2. The earliest/oldest date are will be the same - why do you suspect this
would not be identical?
3. I proposed to summarise each of the 50 or so books so they contained
only one entry for the ID and the oldest/earliest date. Then combining
these
would result in a single workbook of ~20,000 rows. Which would process a
lot
faster!

I recommend the last approach. The following code will sequentially open
each of the report workbooks. The method I would use to summarise each
report book is a combination of sort on ID by Date (ascending), extract
the
first instance of the ID and related date, copy it to the next row in the
lookup workbook, advance to next row in the lookup table, repeat until end
of the report book. Process Next report book .

Finally save the lookup (ID_Lookup.xls), this will have all IDs and the
oldest date in it. You can now use this to map into your master workbook.

Note: I have assumed report books have ID in column A and Date in Column
B,
change the code below to suit what you have, also I copy column A&B to the
lookup, yours will need to be changed to suit.

Sub CombineWbks()

Dim sFilePath As String, sFileName As String
Dim wbLookUp, wbFile As Workbook
Dim iFile As Integer, iFNum
Dim lLookUpRow As Long, lLastRow As Long, lRow As Long

sFilePath = "D:\" ' set this to path for files

Set wbLookUp = Workbooks.Add

lLookUpRow = 1
For iFile = 0 To 50

' buildfile name
iFNum = CStr(iFile * 500)
sFileName = sFilePath & "\" & "nx" & String(6 - Len(iFNum), "0") & iFNum
& "xls"

'test if file exists
If Dir(sFileName) <> "" Then

Set wbFile = Workbooks.Open(sFileName)

' process the file
With wbFile.Sheets("Sheet1")
' get last row
lLastRow = .Cells(.Rows.Count, "A").End(xlUp).Row

' sort the data (assumes 2 columns A= ID; B = Date)
.Range(.Cells(2, "A"), .Cells(lLastRow, "B")).Sort _
key1:=.Range("A2"), Order1:=xlAscending, _
key2:=.Range("B2"), Order2:=xlAscending

' scan file for oldest id/date and transfer to lookup
For lRow = 2 To lLastRow
' check lag by 1 record for change in ID
If .Cells(lRow - 1, "A") <> .Cells(lRow, "A") Then
.Range(.Cells(lRow, "A"), .Cells(lRow, "B")).Copy _
Destination:=wbLookUp.Sheets(1).Cells(lLookUpRow, 1)

' next row in lookup table
lLookUpRow = lLookUpRow + 1

End If
Next

End With

wbFile.Close savechanges:=False

End If

Next iFile

wbLookUp.SaveAs Filename:=sFilePath & "ID_LookUp.xls"

End Sub

--

Regards,
Nigel
(e-mail address removed)









Hi Nigel,
Thank you for your speedy response!!
Each item ID only has fields in 1 of the reports, the reports are
called nx000000.xls, nx000500.xls, nx001000.xls ect so for instance
the first spreadsheet has information on items 000001 - 000499, the
next 000500 - 000999 ect.
I could combine them into 1 large spreadsheet but as there is around
16000 rows in each report, it wouldn't all fit into 1 spreadsheet so
there doesn't seem like any point in combining them at all (vba
solution that ran through 10 large spreadsheets would be the same as
one running through 50 smaller one).
The additional column needs to contain the earliest date, but I'm
guessing that it would almost be identical to a solution that picked
out the oldest date.

Thanks!!
Paul.- Hide quoted text -

- Show quoted text -

Thanks for your reply!
I will toy with this code as it looks very promising!!
A couple of potential problems that I've notice though, I have just
realised the date column that I need to extract is stored as a string
in the format dd/mm/yyyy hh:mm:ss:msmsms. as its a string if you sort
on date it just orders them as if it were a number, so 01/02/2005
orders itself before 02/02/2004. I'm guessing I can put in a column
with something like =RIGHT(LEFT(B2,10),4)&"/"&RIGHT(LEFT(B2,5),
2)&"/"&LEFT(B2,2) just so I can sort accurately on that unless there
is a better way?

Also the reports arn't always in groups of 500, some are 1000 and
others are 100. I'm not entirely sure the reason for this but it is
something to do with the system that we pull the extracts from.

Yes I was having a funny moment earlier :) ofcourse oldest and
earliest mean the exact same thing!!!

I presume this macro would go in a new blank workbook?

I'll start to tinker with it now!
Thanks again!!
Paul.
 
P

poalman

Sorting the dates as text can be adjusted by adding to the sort function

DataOption2:=xlSortTextAsNumbers

As  long as the filenames increment by '500' as you define it, the sizeof
the file and numbers of IDs in each does not matter.   If the filenamesare
not sequential, then you will need to change the file reading process to
scan the directory looking for any file beginning 'nx' - if that is
appropriate?

Use Dir to do this.... e.g.

sfile = Dir(sFilePath & "nx*.xls")
Do While sfile <> ""

   ' process file called sFile

   sfile = Dir
Loop

As written use a blank workbook, or you could put it into the IDLookUp.xls,
in which case change the line

Set wbLookUp = Workbooks.Add

to

Set wbLookUp = ActiveWorkbook

HTH

--

Regards,
Nigel
(e-mail address removed)































Thanks for your reply!
I will toy with this code as it looks very promising!!
A couple of potential problems that I've notice though, I have just
realised the date column that I need to extract is stored as a string
in the format dd/mm/yyyy hh:mm:ss:msmsms. as its a string if you sort
on date it just orders them as if it were a number, so 01/02/2005
orders itself before 02/02/2004. I'm guessing I can put in a column
with something like =RIGHT(LEFT(B2,10),4)&"/"&RIGHT(LEFT(B2,5),
2)&"/"&LEFT(B2,2) just so I can sort accurately on that unless there
is a better way?

Also the reports arn't always in groups of 500, some are 1000 and
others are 100. I'm not entirely sure the reason for this but it is
something to do with the system that we pull the extracts from.

Yes I was having a funny moment earlier :) ofcourse oldest and
earliest mean the exact same thing!!!

I presume this macro would go in a new blank workbook?

I'll start to tinker with it now!
Thanks again!!
Paul.- Hide quoted text -

- Show quoted text -

I tried sorting text as numbers but this sorts it in priority of day
then month then year, I don't suppose there's a functions sort text as
date? or will I be forced to use the "=RIGHT(LEFT(B2,10),
4)&"/"&RIGHT(LEFT(B2,5),2)&"/"&LEFT(B2,2)" cell to put it in year
month day order.

I changed the line to iFNum = CStr(iFile * 100) this then opens any
file and doesn't complain if it doesn't exist. for instand there's
nx021000.xls, nx21100.xls, nx021400.xls, nx021600.xls, nx022000.xls.
All the reports are in multiples of 100 so that works brilliantly!

Thanks!
Paul.
 
P

poalman

Sorting the dates as text can be adjusted by adding to the sort function

DataOption2:=xlSortTextAsNumbers

As  long as the filenames increment by '500' as you define it, the sizeof
the file and numbers of IDs in each does not matter.   If the filenamesare
not sequential, then you will need to change the file reading process to
scan the directory looking for any file beginning 'nx' - if that is
appropriate?

Use Dir to do this.... e.g.

sfile = Dir(sFilePath & "nx*.xls")
Do While sfile <> ""

   ' process file called sFile

   sfile = Dir
Loop

As written use a blank workbook, or you could put it into the IDLookUp.xls,
in which case change the line

Set wbLookUp = Workbooks.Add

to

Set wbLookUp = ActiveWorkbook

HTH

--

Regards,
Nigel
(e-mail address removed)































Thanks for your reply!
I will toy with this code as it looks very promising!!
A couple of potential problems that I've notice though, I have just
realised the date column that I need to extract is stored as a string
in the format dd/mm/yyyy hh:mm:ss:msmsms. as its a string if you sort
on date it just orders them as if it were a number, so 01/02/2005
orders itself before 02/02/2004. I'm guessing I can put in a column
with something like =RIGHT(LEFT(B2,10),4)&"/"&RIGHT(LEFT(B2,5),
2)&"/"&LEFT(B2,2) just so I can sort accurately on that unless there
is a better way?

Also the reports arn't always in groups of 500, some are 1000 and
others are 100. I'm not entirely sure the reason for this but it is
something to do with the system that we pull the extracts from.

Yes I was having a funny moment earlier :) ofcourse oldest and
earliest mean the exact same thing!!!

I presume this macro would go in a new blank workbook?

I'll start to tinker with it now!
Thanks again!!
Paul.- Hide quoted text -

- Show quoted text -

Got this working, however - the actual formula "=RIGHT(LEFT(B2,10),
4)&"/"&RIGHT(LEFT(B2,5),2)&"/"&LEFT(B2,2)" is being copied. How would
I get the code to paste special > values, so i can just get the date?

Sorry for firing all these questions!!

Just got the challenge of bolting this onto the main spreadsheet
now :)

Paul.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top