I said:

You could write the following array formula [*] starting D2 and copy down:

=MAX(IF($B$2:$B$12=B2,IF($C$2:$C$12<>"",$A$2:$A$12))) [....]

Is there a way to do this in vba? I have about

milion rows so it would be much faster I believe.

It probably would not be faster, especially if you use a user-defined

function. However, if you use a subroutine, which you might associate

with a button for ease of use, it would make recalculations and edits

faster

I wrote later:

You could automate that by using a Worksheet_Change

event macro that might invoke the VBA subroutine that

I provided whenever there is a change in columns A, B or C.

However, that would defeat the purpose of the subroutine,

and you would probably be better off with the MAX array

formula, unless the subroutine truly is faster. That

would surprise me

What an odd thing for me to say, on second thought.

The MAX array formula above would access the entire range of data 3 times

__for_each__ instance of the formula. So for "millions" of rows of data,

that would be 3 trillion(!) accesses, not to mention the additional

computations in the formula.

In contrast, the subroutine accesses the entire range data just one time --

a total of 1 million sets of computations for "millions" of rows of data.

The user-defined function below can be used in place of the MAX array

formula. It should be significantly better than the MAX array formula -- on

the order of 1 million sets of computations for "millions" of rows of data.

But it should be slower than the subroutine.

In an actual test with just about 12000 rows of data, the subroutine took

about 0.225 sec, and the UDF took about 1.128 sec for a full recalculation

(ctrl+alt+F9). I did not dare to measure the MAX array formula because of

its combinatorial performance characteristic. (All times are on my

computer. YMMV.)

However, oddly, any edit that causes a recalculation took significantly

longer -- about 94 sec(!). I cannot explain why that is longer than a

forced full recalculation (ctrl+alt+F9), which I confirmed does indeed call

the UDF for all 12000 formulas. And I cannot explain why I cannot duplicate

that long edit-induced recalculation by doing the same thing in a macro.

In any case, my point is: that is 1 to 94 sec for a mere 12000 rows.

Figure about 8000(!) times more for "millions" of rows.

That is why the manual subroutine might still be your best option. Even

then, we are talking about 30 min of computation (on my computer; YMMV).

For what it's worth, my user-defined function is as follows. Once again, I

assume that all data for a company are in adjacent rows, as they are in your

example.

Option Explicit

Function maxDate(date1 As Range, company As Range, date2 As Range)

Dim firstRow As Long, n As Long, myRow As Long

Dim myCompany, d1 As Long, i As Long

Dim maxDate1 As Long ' initially zero

firstRow = date1.Row

If firstRow <> company.Row Or firstRow <> date2.Row _

Then GoTo err

n = date1.Rows.Count

If n <> company.Rows.Count Or n <> date2.Rows.Count _

Then GoTo err

If date1.Columns.Count <> 1 Or company.Columns.Count <> 1 _

Or date2.Columns.Count <> 1 _

Then GoTo err

myRow = Application.Caller.Row - firstRow + 1

If myRow < 1 Or myRow > n Then GoTo err

myCompany = company(myRow)

If date2(myRow) <> "" Then maxDate1 = date1(myRow)

For i = myRow - 1 To 1 Step -1

If company(i) <> myCompany Then Exit For

If date2(i) <> "" Then

d1 = date1(i)

If d1 > maxDate1 Then maxDate1 = d1

End If

Next

For i = myRow + 1 To n

If company(i) <> myCompany Then Exit For

If date2(i) <> "" Then

d1 = date1(i)

If d1 > maxDate1 Then maxDate1 = d1

End If

Next

maxDate = IIf(maxDate1 <> 0, maxDate1, "")

Exit Function

err:

maxDate = CVErr(xlErrValue)

End Function