Author Topic: How do you deal with low or high analytical totals when calculating statistics?

Gian Colombo

  • Post Doc
  • Posts: 11
Hello all,

We are receiving requests for analyses on metal specimens that require us to analyze multiple points and calculate homogeneity statistics.  During the analysis runs, I have some points with analytical totals outside of the normally accepted 98wt% - 102wt% range.  I believe I understand the statistical equations required to calculate the parameters of interest, but I am under the impression that the stats for each element should be calculated from a data set consisting only of data points that have "acceptable" analytical totals.  What should I do with the points that are outside the 98wt%-102wt% range? 

Should I assume the analysis at those points is unreliable and remove them from the data set before calculating the stats?

Should I normalize the elements to bring the total to 100wt%?

Is there a point where you would consider the whole analysis to be unreliable if you have an excessive number of points outside the 98wt%-102wt% range?  10%? 25%?   
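
For reference, the calculation I have in mind is roughly the following (a minimal Python sketch of the filtering and per-element stats; the element names, numbers, and the 98-102 cut-off are just placeholders for illustration):

```python
import numpy as np

# Each row is one analyzed point: wt% for each element plus the analytical total.
# All numbers below are made up purely for illustration.
elements = ["Fe", "Ni", "Cr"]
data = np.array([
    # Fe     Ni     Cr    total
    [70.1,  19.8,  10.2, 100.1],
    [69.5,  20.1,   9.9,  99.5],
    [65.0,  18.5,   9.3,  92.8],   # low total -- the kind of point in question
    [70.4,  19.9,  10.1, 100.4],
])

totals = data[:, -1]
ok = (totals >= 98.0) & (totals <= 102.0)   # keep only "acceptable" totals
good = data[ok, :-1]

for i, el in enumerate(elements):
    mean = good[:, i].mean()
    std = good[:, i].std(ddof=1)            # sample standard deviation
    print(f"{el}: mean = {mean:.2f} wt%, std = {std:.2f}, "
          f"rel. std = {100 * std / mean:.2f}%")
```

The question is really about what should and shouldn't end up in that filtered set before the stats are calculated.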

Ben Buse

  • Professor
  • Posts: 499
Hi,

What are you analyzing? Under normal operating conditions/routine analysis you should, as you say, achieve 98-102% totals (where the counting error on the major elements, and thus on the total, is < 2%).

In that case I think you should throw out all the data outside this range as unreliable and then do the stats.

Exceptions: if you have hydrous phases, then the total may legitimately be lower. If you have Fe3+ but you've calculated it as Fe2+, then again the total could be lower than 98% even though the data are very good.

You should not normalize your data, as whatever is causing the high or low total will not affect all elements equally.
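
To put some made-up numbers on that: suppose the true composition is 70/20/10 and only the Fe measurement is bad, giving a total of 94. Normalizing scales every element by the same factor, so the elements that were measured correctly get dragged along with the error (quick Python illustration, all values hypothetical):

```python
true = {"Fe": 70.0, "Ni": 20.0, "Cr": 10.0}      # hypothetical true composition
measured = {"Fe": 64.0, "Ni": 20.0, "Cr": 10.0}  # only Fe is under-measured; total = 94
total = sum(measured.values())
normalized = {el: round(100.0 * v / total, 1) for el, v in measured.items()}
print(normalized)
# {'Fe': 68.1, 'Ni': 21.3, 'Cr': 10.6} -- Fe is still wrong, and Ni and Cr are now wrong too
```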

Ben

Gian Colombo

  • Post Doc
  • Posts: 11
I am analyzing metal specimens that are generally single phase.  Occasionally I will get a request to analyze a specimen with two phases and report the average composition and stats for each phase, but in all cases I am expecting to get totals in the 98-102 range. 

I agree that normalizing would not be a good idea. The idea of normalizing the data had come up when reviewing an old data set in which a significant number of points, maybe 25%, had low analytical totals. The basic idea was to "salvage the run" and get some use out of the data, because losing a quarter of the data points reduced the power of the statistical tests. I was thinking that a data set with that many points outside the 98-102 range should be treated as suspicious and scrapped, but I didn't really have a good reason other than my own "gut feeling".

Any thoughts on the percent of unreliable data points in a data set?

Thanks very much for the help,
Gian

Ben Buse

  • Professor
  • Posts: 499
Sounds tricky. The trouble is that you don't know what is causing those low totals, and therefore what method should be used to correct the data.

More generally, does introducing data points with a larger error (let's say a relative % error equal to their deviation from a 100% total) into the dataset improve the stats on that dataset or not?

Don't know
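
One way to get a feel for it might be a toy simulation: assume (for the sake of argument) that the out-of-range points are unbiased but simply noisier, and see whether including them reduces the error in the estimated mean. Something like this sketch, where all the numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 70.0           # hypothetical "true" wt% of one element
n_good, n_bad = 30, 10      # counts of in-range and out-of-range points (made up)
sd_good, sd_bad = 0.3, 1.5  # assumed precision of good vs. suspect points (made up)

err_good_only, err_all = [], []
for _ in range(2000):
    good = rng.normal(true_value, sd_good, n_good)
    bad = rng.normal(true_value, sd_bad, n_bad)
    err_good_only.append(abs(good.mean() - true_value))
    err_all.append(abs(np.concatenate([good, bad]).mean() - true_value))

print("mean error, good points only:", round(np.mean(err_good_only), 4))
print("mean error, all points:      ", round(np.mean(err_all), 4))
```

Of course, if the suspect points are biased rather than just noisy (which seems more likely with low totals), including them will shift the mean rather than help it, and that is the real worry.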

Ben