Need to Sum one column for each row that is part of a group and then filter out dups

kukarooza

I have around 53,000 rows of data with columns A-I. The first five rows
for example have data for a particular student... STUDENT A. Rows 6
through 15 have data about another student... STUDENT B, and so forth.


For each student, every column has the same data in every row, with the
exception of the last column, column I. Column I has the student's GPA
(Grade Point Average) for that row.

So, for example, STUDENT A might have a GPA in row 1 of 4.0, a GPA in
row 2 of 2.5, a GPA in row 3 of 4.0, a GPA in row 4 of 3.0, and a GPA
in row 5 of 4.0.

Since all the rows for each person have the same data in them
(with the exception of column I), I want to get rid of the duplicate
rows per person. I want the one row that's left to represent each
person to show an average of all the GPAs that were listed for
that particular student before the duplicates were removed.

For example,

STUDENT A
Row 1 GPA (in column I) is 4.0
Row 2 GPA (in column I) is 2.5
Row 3 GPA (in column I) is 4.0
Row 4 GPA (in column I) is 3.0
Row 5 GPA (in column I) is 4.0

I need to filter out rows 2-5 and have only one row representing
the student. In that row (in column I), I need the GPA to
be the average of what was listed in I1-I5... or 3.5 in this example.

When I filter out duplicate data, I'm currently getting the one row I'm
looking for, but that row only lists the 4.0 that was in
the first row of the student's group of rows.

Is it possible to filter out duplicate rows and place a formula in the
row that's left for each student that will give the average of all the
rows that were originally showing for that person?

Thanks in advance!
 
Dave Peterson

I think I'd try a pivottable.

First, insert a new column A (shift everything to the right one column)

Then insert headers in row 1.

Then in A2, put a formula like:

=B2&"|"&C2&"|"&D2&"|"&E2&"|"&F2&"|"&G2&"|"&H2&"|"&I2

I used the vertical bar to isolate fields. If your data has any vertical bars
in it, choose a different (unique) character--we're going to "decombine" it
using this character.
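
For anyone who wants to see the idea outside of Excel, here is a minimal Python sketch of what that helper column does. The function name make_key and the sample field values are made up for illustration; the only thing taken from the steps above is joining the identifying fields with a delimiter that must not appear in the data.

```python
DELIM = "|"  # same separator as in the worksheet formula


def make_key(fields, delim=DELIM):
    """Join the identifying fields of one row into a single key string."""
    texts = [str(f) for f in fields]
    for text in texts:
        # If the delimiter is already in the data, the key cannot be split
        # back apart reliably, so refuse and let the caller pick another one.
        if delim in text:
            raise ValueError(f"delimiter {delim!r} found in field {text!r}")
    return delim.join(texts)


# Hypothetical row: everything except the GPA column goes into the key.
print(make_key(["STUDENT A", "Math", 2005]))  # STUDENT A|Math|2005
```

The check for the delimiter is the same precaution as choosing a different character when your data already has vertical bars in it.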

Drag this formula down all 53k rows.

Now select your data (A1:J53000, (it shifted over one)).

Data|Pivottables
Follow the wizard until you get to a dialog with a Layout button in the bottom
left.

Click that layout button
Drag the header for column A to the Row field.
Drag the header for column J to the Data field.
When you let go of that header-like button, double click on it and choose
Average.
(It may have said Sum or Count--depending on what was in that column.)

Ok and finish.

Now you've got a column that has the concatenated information in it and a column
that has the average in it.
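
As a rough illustration of what the pivottable is computing at this point (one row per combined key, with the average of the GPA column), here is a small Python sketch. The function name average_by_key and the sample keys are assumptions for the example; the GPA values are the ones from the question.

```python
from collections import defaultdict
from statistics import mean


def average_by_key(rows):
    """rows: iterable of (key, gpa) pairs. Returns {key: average gpa}."""
    groups = defaultdict(list)
    for key, gpa in rows:
        groups[key].append(float(gpa))
    # One entry per distinct key, in the order the keys were first seen.
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical keys; the five GPAs for STUDENT A come from the question.
rows = [
    ("STUDENT A|x", 4.0),
    ("STUDENT A|x", 2.5),
    ("STUDENT A|x", 4.0),
    ("STUDENT A|x", 3.0),
    ("STUDENT A|x", 4.0),
    ("STUDENT B|y", 3.0),
]
result = average_by_key(rows)
print(result["STUDENT A|x"])  # 3.5
```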

Convert it to values:
ctrl-a (twice with xl2003)
edit|copy
edit|paste special|Values

Move column B to the left of column A (just select the whole column B and
Edit|Cut, then select column A and Edit|Insert Cut Cells).

(almost done)

Now delete rows 1-3 (if required). Just keep one header row.

Select column B (that has all the concatenated values)
Data|Text to columns
delimited by vertical bar (or whatever unique character you used!)
and finish it up.

You may have to format the columns (like dates/percentages/numbers), but that
shouldn't take too long.
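
The Text to Columns step is just the reverse of the concatenation. A minimal Python sketch of the same idea, with a made-up function name (expand_keys) and made-up sample data:

```python
def expand_keys(averages, delim="|"):
    """Turn {combined_key: average} into rows of [field1, field2, ..., average]."""
    return [key.split(delim) + [avg] for key, avg in averages.items()]


# Hypothetical combined key and its averaged GPA.
averages = {"STUDENT A|Math|2005": 3.5}
rows = expand_keys(averages)
print(rows)  # [['STUDENT A', 'Math', '2005', 3.5]]
```

Notice that 2005 comes back as the text '2005'. That is the same reason the columns may need reformatting after the split.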

====
The bad news: newer versions of Excel handle lots of data better in pivottables
than earlier versions.

Save your work (as a different workbook) whenever you decide you don't want to
do it again!

And if your version of excel doesn't handle all the data, maybe you could split
it up into two (or more) worksheets (don't split any one person across multiple
sheets) and do this a couple (or more) times.
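
If you do need to split the data, the one rule is to cut only where the person changes. Here is a hedged Python sketch of that rule, assuming the rows are already sorted so each person's rows are together (as in the original data). The names split_into_chunks, key_of and max_rows are made up for the example.

```python
from itertools import groupby


def split_into_chunks(rows, key_of, max_rows):
    """Split rows into chunks of at most max_rows, never splitting one person.

    A single person with more than max_rows rows still gets a chunk of
    their own, which will then be larger than max_rows.
    """
    chunks, current = [], []
    for _, group in groupby(rows, key=key_of):
        group = list(group)
        # Start a new chunk if this person's rows would not fit.
        if current and len(current) + len(group) > max_rows:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical data: (student, gpa) pairs, already grouped by student.
rows = [("A", 4.0)] * 3 + [("B", 3.0)] * 2 + [("C", 2.0)] * 2
chunks = split_into_chunks(rows, key_of=lambda r: r[0], max_rows=4)
print([len(c) for c in chunks])  # [3, 4]
```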
 
