Topic: Question on using Chi-square distribution for heterogeneity calculation

Gian Colombo

  • Post Doc
  • ***
  • Posts: 11
This question is specifically related to John Donovan's very useful lecture notes and pdf's that are posted in the "EPMA (and SEM) Education and Training" area of the board.  http://probesoftware.com/smf/index.php?topic=562.0

The chapter 7 lecture notes on statistics cover a procedure for calculating heterogeneity limits using a Chi-square distribution.  I have attached a screenshot of the page of interest (page 22).  The table on this page lists chi-square values for different levels of probability and different degrees of freedom, which are then used to construct a confidence interval.  The example in the text uses 99 degrees of freedom and a 99% confidence level, which according to the table gives chi-square values of 0.6693 and 1.3600.

I have been trying to create a spreadsheet for general use to run this calculation, but I have not been able to duplicate the numbers listed in the table.  Excel, Minitab, and multiple other sources with chi-square tables give chi-square values of 69.23 and 134.6 for the left-tail and right-tail areas (probabilities), respectively.  If I instead use two tails that together add up to 1% probability outside the chi-square limits, I get chi-square values of 66.51 and 139.0.

Does anyone have an idea where the difference is coming from?

Thanks for the help,
Gian

Probeman

  • Emeritus
  • *****
  • Posts: 2858
  • Never sleeps...
    • John Donovan
Re: Question on using Chi-square distribution for heterogeneity calculation
« Reply #1 on: November 02, 2015, 10:54:51 AM »
Hi Gian,
I should also clarify that this chapter was written by my predecessors Jack Rice and Michael Shaffer. Jack is retired and Mike is a consultant, maybe retired now.  I will ping them to see if they can respond.
john
The only stupid question is the one not asked!

michael shaffer

  • Guest
Re: Question on using Chi-square distribution for heterogeneity calculation
« Reply #2 on: November 03, 2015, 12:25:09 PM »
I cannot speak to the difference. The values are curiously similar, but clearly the two orders of magnitude would cause problems in the result (Eq. 7-30). This technique was developed by Dan Weill, who had returned to Eugene, where I saw him again at Dana's retirement party at DoGS. It was first used in the thesis of one of his Master's students.  Jack and I rewrote the text and added quite a bit, but this technique, used for evaluating all the synthetic glass for Dan's calorimetry, was strictly his.

Keep in mind that in those years no one even had a calculator; at that time (’74-75) we relied heavily on slide rules and statistics tables. Therefore I would, and I dare say Dan would also, have a difficult time helping Gian. We may have understood many of the principles of the statistics used in this chapter, but we were quite unfamiliar with the derivation of many of them, e.g., chi-square. Good luck, and I hope to keep up with this topic.

Cheers from Oliver's Pond, Avalon Peninsula

Gian Colombo

  • Post Doc
  • ***
  • Posts: 11
Re: Question on using Chi-square distribution for heterogeneity calculation
« Reply #3 on: November 03, 2015, 01:18:47 PM »
I just found my oversight.  Excel, Minitab, etc. are calculating the X^2 statistic using the equation X^2 = [(n-1)*s^2]/sigma^2

Eq. 7-29 in the lecture notes has divided both sides by (n-1).  As a result, the equation is for an X^2 value that has been divided by the degrees of freedom (n-1).  The giveaway is in the paragraph above the table on page 22, where the title of the table from the CRC handbook is listed.  The title is 'percentage points, chi-square over degrees of freedom distribution'.  I didn't pick this up at first, but this is really saying that the table gives the X^2/(n-1) values for different percentage points.

If you take my original numbers for X^2, 69.23 and 134.6, and divide them by 99, you get 0.6993 and 1.3600, which match the table on page 22.  So in order to make spreadsheet calculations work, the X^2 values given by Excel's functions have to be divided by the degrees of freedom (n-1).
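For anyone who wants to check this outside a spreadsheet, here is a minimal Python sketch of the same arithmetic using only the standard library. The helper names chi2_cdf and chi2_quantile are made up for this illustration and do not come from the lecture notes or from any package. It also assumes that the 69.23 and 134.6 values correspond to cumulative probabilities of 0.01 and 0.99, which is inferred from the numbers rather than stated in the notes.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF, i.e. the regularized lower incomplete gamma P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    prefactor = math.exp(-z + a * math.log(z) - math.lgamma(a))
    if z < a + 1.0:
        # Series expansion, which converges quickly when z < a + 1
        n, term = a, 1.0 / a
        total = term
        while abs(term) > abs(total) * 1e-15:
            n += 1.0
            term *= z / n
            total += term
        return total * prefactor
    # Continued fraction (modified Lentz method) for the upper tail Q = 1 - P
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return 1.0 - h * prefactor

def chi2_quantile(p, df):
    """Invert the CDF by bisection: find x such that chi2_cdf(x, df) = p."""
    lo, hi = 0.0, max(1.0, float(df))
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    df = 99  # n - 1 for the n = 100 example in the notes
    for p in (0.01, 0.99):  # assumed tail probabilities, see note above
        x2 = chi2_quantile(p, df)
        # The second number is what the CRC-style "chi-square over df" table lists
        print(f"p={p}: X^2 = {x2:.2f}, X^2/(n-1) = {x2 / df:.4f}")
```

In Excel, as far as I know the equivalent is CHISQ.INV(probability, n-1) divided by (n-1). Passing 0.005 and 0.995 to the sketch instead gives the two-tailed pair mentioned in the first post.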

The devil is in the details.

Thanks,
Gian 

michael shaffer

  • Guest
Re: Question on using Chi-square distribution for heterogeneity calculation
« Reply #4 on: November 04, 2015, 03:22:06 AM »
I looked for a common factor but missed your typo in the op (.6693). Still, the other pair should've given me the clue ...

Good catch!