How to plot every 2nd, 3rd, etc point

B

Bill Grigg

All,

I did a google on the subject, but I still feel there must be a better
way. Several of the other people that have posted this sort of
question were dealing with tick mark issues. In my case, I have a
situation where I get sets of data with between 20,000 and 200,000
lines of text. In the case where I get over 32,000 lines I only really
want a representative sampling of the data points to plot. Soooo if I
get a set with 40,000 points I am happy to just plot very other point.
One solution I found suggested hiding every other row. This approach
has not worked well for me. Another suggestion was to create a new
column with one good value alternated with NA. This too has not worked
very well. I am more than happy to delete every other row, if that
works OK. I am using Excel 2003

TIA

Bill
 
J

Jon Peltier

Hi Bill -

Another option is to put your data into columns A and B, allow for
headers in A1 and B1. In C2 enter a formula like this:

=OFFSET(A$2,5*(ROW()-2),0)

(the 5 in the formula will make Excel show only every 5th point). Copy
this formula to D2, then drag C2:D2 down as far as necessary.

- Jon
-------
Jon Peltier, Microsoft Excel MVP
Peltier Technical Services
Tutorials and Custom Solutions
http://PeltierTech.com/
_______
 
V

vandenberg p

Here is suggestion that should work, there are many other it would seem.

This example data starts in row 1 and column A.
Assume that the original data is in column A and B starting in row 2.
Create three new columns using the functions given below.
This technique use the index function to create a two new columns of data
x and y data. Notice these two columns
are in D and E. Column C is a helper column that computes the row number
of data you wish to plot, which will be in columns D and E.
The example below used 50 data points so only 6 (5 + 0 point) are ploted.
YOu can play with the plot by changing the value in C1 which
controls the number of data points skipped.

Cell C1 contains 10 in this case so it will plot the 0, 10, 20 etc. value

Row/Col A B C D E
1 - - 10 - -
2 1 10 0 1 10
3 2 20 10 10 100
4 3 30 20 20 200
5 4 40 30 30 300
6 5 50 40 40 400
7 6 60 50 50 500
8 7 70 #N/A #N/A #N/A
9 8 80 #N/A #N/A #N/A
10 9 90 #N/A #N/A #N/A
11 10 100 #N/A #N/A #N/A
12 11 110 #N/A #N/A #N/A
Etc

Cell Formulas:

Cell C1 = Constant (10) (lets call it n)
Cell C2 = Constant (0) The first Cell you want to plot

Here are the formulas for the three columns:

Cell D2 ==INDEX($A$2:$A$51,C2,1)
Cell E2 =INDEX($B$2:$B$51,C2,1)
Cell C3 =IF((+C2+$C$1)>MAX($A$2:$A$51),NA(),C2+$C$1)

These are copied down for as many rows as you need. Since the NA() won't
plot you can copy for your entire data set if you wish, but you only need
the number of rows that would be (number of data rows you have)/n)

Pieter Vandenberg
(e-mail address removed)
-----------------------------------------------
Anti-spam
Please do not reply to this message's address.
If you wish to contact me use the above address.
Thank you.
------------------------------------------------


: All,

: I did a google on the subject, but I still feel there must be a better
: way. Several of the other people that have posted this sort of
: question were dealing with tick mark issues. In my case, I have a
: situation where I get sets of data with between 20,000 and 200,000
: lines of text. In the case where I get over 32,000 lines I only really
: want a representative sampling of the data points to plot. Soooo if I
: get a set with 40,000 points I am happy to just plot very other point.
: One solution I found suggested hiding every other row. This approach
: has not worked well for me. Another suggestion was to create a new
: column with one good value alternated with NA. This too has not worked
: very well. I am more than happy to delete every other row, if that
: works OK. I am using Excel 2003

: TIA

: Bill
 
J

Jon Peltier

vandenberg said:
Here is suggestion that should work, there are many other it would seem.

This example data starts in row 1 and column A.
Assume that the original data is in column A and B starting in row 2.
Create three new columns using the functions given below.
This technique use the index function to create a two new columns of data
x and y data. Notice these two columns
are in D and E. Column C is a helper column that computes the row number
of data you wish to plot, which will be in columns D and E.
The example below used 50 data points so only 6 (5 + 0 point) are ploted.
YOu can play with the plot by changing the value in C1 which
controls the number of data points skipped.

Cell C1 contains 10 in this case so it will plot the 0, 10, 20 etc. value

Row/Col A B C D E
1 - - 10 - -
2 1 10 0 1 10
3 2 20 10 10 100
4 3 30 20 20 200
5 4 40 30 30 300
6 5 50 40 40 400
7 6 60 50 50 500
8 7 70 #N/A #N/A #N/A
9 8 80 #N/A #N/A #N/A
10 9 90 #N/A #N/A #N/A
11 10 100 #N/A #N/A #N/A
12 11 110 #N/A #N/A #N/A
Etc

Cell Formulas:

Cell C1 = Constant (10) (lets call it n)
Cell C2 = Constant (0) The first Cell you want to plot

Here are the formulas for the three columns:

Cell D2 ==INDEX($A$2:$A$51,C2,1)
Cell E2 =INDEX($B$2:$B$51,C2,1)
Cell C3 =IF((+C2+$C$1)>MAX($A$2:$A$51),NA(),C2+$C$1)

These are copied down for as many rows as you need. Since the NA() won't
plot you can copy for your entire data set if you wish, but you only need
the number of rows that would be (number of data rows you have)/n)

Pieter Vandenberg
(e-mail address removed)
-----------------------------------------------

I think you meant Cell *B1* = Constant (10), and keep C1 free of
numerical data.

The one enhancement I'd suggest to this is to define names for the
ranges to be plotted. Press CTRL-F3 and define these:

Name: MyX
Refers To:
=OFFSET($C$2,0,0,COUNT($C:$C),1)

Name: MyY
Refers To:
=OFFSET(MyX,0,1)

These contain only the parts of columns C and D that contain numerical
data. When making the chart, in Step 2 of the Wizard, click on the
Series tab, click Add; for X values, enter =Sheet1!MyX, for Y values
enter =Sheet1!MyY. If the chart is already made, select it and choose
Source Data from the Chart menu, click the Series tab, and enter the
names as above.

The reason I suggest this is that Bill talked of 40k points, which
exceeds the maximum size of an Excel XY chart series. Even if you cut it
off at 30,000, carrying 26,000 #N/A cells in the series is a lot of baggage.

- Jon
-------
Jon Peltier, Microsoft Excel MVP
Peltier Technical Services
Tutorials and Custom Solutions
http://PeltierTech.com/
_______
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top