3D histogram?

J

jchambers13419

I need to create a 3D table from a multitude of data points, similar to the
2D histogram but in 3D. The data is engine data gathered from road testing
of a vehicle in the form of throttle position (TP), engine speed (RPM) and
engine load. Engine load is the value that I'm trying to develop a running
average/profile at the various TP and RPM points. RPM is the x-axis with the
first column being 500 rpm and then incremented to 6000 (see below). TP is
the y-axis starting at 0 and incrementing to full throttle being 750 counts.
Load is calculated by the engine PCM for the instananeous TP and RPM
conditions. I gather tens of thousands of the three point sets while driving
the car at various loads, speeds and acceleration rates. Data is collected
in .csv format. The resulting table is basically 10 rows x 10 columns in
size with:

x-axis (500, 800, 1100, 1400, 1700, 2000, 3000, 4000, 5000, 6000)
y-axis (0, 40, 80, 120, 160, 200, 250, 300, 500, 750)

My problem is how to process the data quickly and easily so that the load
data is sorted into the appropriate cells of the table (RPM, TP) and the
value of each load cell averaged and updated as load values are recorded for
each cell. For example, a small dataset might be

RPM, TP, LOAD
1338, 147, .235
2472, 220, .400
1190, 122, .210
4510, 540, .780
......

Data points 1 and 3 would be cast into the same cell (1100, 120) since the
RPM is between 1000 and 1499 and TP between 120 and 159. The resulting
average load for that cell would then be (.235+.210)/2 = 222.5. Data point 2
would fall into the (2000, 200) cell and point 4 would fall into the (4000,
500) cell.

I haven't been able to figure out a way to take tens of thousands of such
data points and construct the resulting table quickly and easily. This is
something that I have a need to do quite often so I'm desparately seeking
automation to make the task more efficient. Any help would be greatly
appreciated.
 
J

Jon Peltier

This may not be the "best" way to collate the data, but it works.

Put your data in columns A:C as you've laid out. Select the numerical data
in the first column, and name it "dataRPM" (with the range selected, type
dataRPM into the name box, to the left of the formula bar, and press Enter).
Select the numerical data in the second column and name it "dataTP". Select
the data in the third column and name it "dataLOAD".

In F1:p1 enter 0, 40, 80, 120, 160, 200, 250, 300, 500, 750, 1000000 (or
another value which is above the effective max)
In E2:E12 enter 500, 800, 1100, 1400, 1700, 2000, 3000, 4000, 5000, 6000,
1000000 (or another value which is above the effective max)
In F2 enter this array formula:

=SUM((dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1)*dataLOAD)/SUM((dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1))

To make this into an array formula, make sure the cursor is blinking in the
formula, then hold CTRL+SHIFT while pressing Enter. If done properly, Excel
will place curly braces {} around the formula.

Some people don't like array formulas, so a possible non-array alternative
is:

=SUMPRODUCT(--(dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1),dataLOAD)/SUMPRODUCT(--(dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1))

Whatever.

Copy F2, select F2:O11, and paste. It will take a few seconds to update
10000 records (I just tested it), perhaps significantly longer if you're
using Excel 2007.

Don't waste your time trying to make 3D histograms. They will be impossible
to interpret. You have two choices:

(a) Make two charts of this table, one by rows, one by columns. Use line
charts to equally space the parameters along the X axis, or XY charts to
space them according to value. It's according to the value at the bottom of
the range (e.g. 120 for 120<=x<160. You will plot your LOAD against one
factor, with separate curves for each bin of the other factor.

(b) Make a contour/surface plot of the calculated table. See
Surface and Contour Charts in Microsoft Excel
http://pubs.logicalexpressions.com/Pub0009/LPMArticle.asp?ID=447

- Jon
-------
Jon Peltier, Peltier Technical Services, Inc.
http://PeltierTech.com/WordPress/

Advanced Excel Conference - June 17-18 2009 - Charting and Programming
http://peltiertech.com/Training/2009-06-ACNJ/AdvExcelConf200906ACNJ.html
_______
 
J

jchambers13419

Thanks Jon. I'll try this and report back.

Jon Peltier said:
This may not be the "best" way to collate the data, but it works.

Put your data in columns A:C as you've laid out. Select the numerical data
in the first column, and name it "dataRPM" (with the range selected, type
dataRPM into the name box, to the left of the formula bar, and press Enter).
Select the numerical data in the second column and name it "dataTP". Select
the data in the third column and name it "dataLOAD".

In F1:p1 enter 0, 40, 80, 120, 160, 200, 250, 300, 500, 750, 1000000 (or
another value which is above the effective max)
In E2:E12 enter 500, 800, 1100, 1400, 1700, 2000, 3000, 4000, 5000, 6000,
1000000 (or another value which is above the effective max)
In F2 enter this array formula:

=SUM((dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1)*dataLOAD)/SUM((dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1))

To make this into an array formula, make sure the cursor is blinking in the
formula, then hold CTRL+SHIFT while pressing Enter. If done properly, Excel
will place curly braces {} around the formula.

Some people don't like array formulas, so a possible non-array alternative
is:

=SUMPRODUCT(--(dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1),dataLOAD)/SUMPRODUCT(--(dataRPM>=$E2)*(dataRPM<$E3)*(dataTP>=F$1)*(dataTP<G$1))

Whatever.

Copy F2, select F2:O11, and paste. It will take a few seconds to update
10000 records (I just tested it), perhaps significantly longer if you're
using Excel 2007.

Don't waste your time trying to make 3D histograms. They will be impossible
to interpret. You have two choices:

(a) Make two charts of this table, one by rows, one by columns. Use line
charts to equally space the parameters along the X axis, or XY charts to
space them according to value. It's according to the value at the bottom of
the range (e.g. 120 for 120<=x<160. You will plot your LOAD against one
factor, with separate curves for each bin of the other factor.

(b) Make a contour/surface plot of the calculated table. See
Surface and Contour Charts in Microsoft Excel
http://pubs.logicalexpressions.com/Pub0009/LPMArticle.asp?ID=447

- Jon
-------
Jon Peltier, Peltier Technical Services, Inc.
http://PeltierTech.com/WordPress/

Advanced Excel Conference - June 17-18 2009 - Charting and Programming
http://peltiertech.com/Training/2009-06-ACNJ/AdvExcelConf200906ACNJ.html
_______
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top