How can I change 0's in the data set to do a LOG with base 2?

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I'm analyzing a large data set (80 columns, 800 rows) and some of my values
are 0, but I need to find out how to incoporate/exclude them, because I need
to do a Log with a base 2 on each value in all 80 categories. I'm trying to
calculate the Shannon-Wiener Index on each column (if that helps).
 
j_erussell01 said:
I'm analyzing a large data set (80 columns, 800 rows) and some of my
values
are 0, but I need to find out how to incoporate/exclude them, because I
need
to do a Log with a base 2 on each value in all 80 categories. I'm trying
to
calculate the Shannon-Wiener Index on each column (if that helps).

I'm not an expert on Shannon-Wiener Index, but is it not supposed
to use the *number* of individuals in a species relative to the
entire population, not the individual itself?

My understanding is, that you have a column of numbers and
you need to know the total amount of numbers, how many different
numbers are present and how many of each in order to calculate the index.

Please bear with me, if I haven't understood, what you're trying to
accomplish.
 
j_erussell01 said:
I'm analyzing a large data set (80 columns, 800 rows) and some of my values
are 0, but I need to find out how to incoporate/exclude them, because I need
to do a Log with a base 2 on each value in all 80 categories. I'm trying to
calculate the Shannon-Wiener Index on each column (if that helps).

I know nothing about the Shannon-Wiener index beyond what I just read
on a wikipedia.org web page. But my understanding is that it is
computed using LN(), not LOG(...,2).

In either case, your problem is the same. That is, I think you want to
compute -SUM(p*LN(p), i=1,...,S), and you need to avoid the error
caused by evaluating LN(0)

One solution is the following array formula (enter using
ctrl-shift-Enter):

=-sumproduct(A1:CB800, if(A1:CB800<=0, 0, ln(A1:CB800)))

Someone else might be able to provide a SUMPRODUCT() formula that does
not require an array formula. But the IF() function seems necessary to
avoid evaluating LN(0).
 
Back
Top