Sunday, 21 August 2011

Micro- and Macro-average of Precision, Recall and F-Score

I have posted several articles explaining how precision and recall are calculated, and how the F-Score is their equally weighted harmonic mean. That left me wondering how to calculate the average precision, recall, and F-Score of a system when it is applied to several sets of data.

It is a little tricky, but I found it very interesting. There are two methods for obtaining such averaged statistics in information retrieval and classification.

1. Micro-average Method

In the micro-average method, you sum up the individual true positives, false positives, and false negatives of the system over the different sets and then apply the usual formulas to the totals. For example, for one set of data, the system's

True positive (TP1) = 12
False positive (FP1) = 9
False negative (FN1) = 3

Then precision P1 = TP1/(TP1+FP1) and recall R1 = TP1/(TP1+FN1) will be 57.14 and 80 (as percentages)

and for a different set of data, the system's


True positive (TP2) = 50
False positive (FP2) = 23
False negative (FN2) = 9

Then precision (P2) and recall (R2) will be 68.49 and 84.75
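
The per-set figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `precision_recall` is my own for illustration and does not come from any library.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) as percentages from raw counts.

    Hypothetical helper for illustration, not a library function.
    """
    precision = 100.0 * tp / (tp + fp)
    recall = 100.0 * tp / (tp + fn)
    return precision, recall


# Counts taken from the two example data sets in the article.
p1, r1 = precision_recall(tp=12, fp=9, fn=3)
p2, r2 = precision_recall(tp=50, fp=23, fn=9)

print(f"P1 = {p1:.2f}, R1 = {r1:.2f}")  # P1 = 57.14, R1 = 80.00
print(f"P2 = {p2:.2f}, R2 = {r2:.2f}")  # P2 = 68.49, R2 = 84.75
```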

Now, the average precision and recall of the system using the micro-average method are

Micro-average of precision = (TP1+TP2)/(TP1+TP2+FP1+FP2) = (12+50)/(12+50+9+23) = 65.96
Micro-average of recall = (TP1+TP2)/(TP1+TP2+FN1+FN2) = (12+50)/(12+50+3+9) = 83.78

The Micro-average F-Score is simply the harmonic mean of these two figures: 2 × 65.96 × 83.78 / (65.96 + 83.78) = 73.81.
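
As a rough sketch of the micro-average calculation, the counts can be pooled first and the formulas applied once to the totals. The variable names and the `harmonic_mean` helper are assumptions made for this example.

```python
def harmonic_mean(a, b):
    """Equally weighted harmonic mean of two numbers (0 if both are 0)."""
    return 2 * a * b / (a + b) if (a + b) else 0.0


# (TP, FP, FN) for each data set, as given in the article.
counts = [(12, 9, 3), (50, 23, 9)]

# Pool the raw counts across all sets before computing anything.
tp = sum(c[0] for c in counts)
fp = sum(c[1] for c in counts)
fn = sum(c[2] for c in counts)

micro_p = 100.0 * tp / (tp + fp)
micro_r = 100.0 * tp / (tp + fn)
micro_f = harmonic_mean(micro_p, micro_r)

print(f"Micro-average precision = {micro_p:.2f}")  # 65.96
print(f"Micro-average recall    = {micro_r:.2f}")  # 83.78
print(f"Micro-average F-Score   = {micro_f:.2f}")  # 73.81
```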

2. Macro-average Method

The method is straightforward: just take the average of the system's precision and recall on the different sets. For example, the macro-average precision and recall of the system for the given example are

Macro-average precision = (P1+P2)/2 = (57.14+68.49)/2 = 62.82
Macro-average recall = (R1+R2)/2 = (80+84.75)/2 = 82.37

The Macro-average F-Score will be simply the harmonic mean of these two figures.
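
The macro-average can be sketched the same way: compute precision and recall per set, then average them. The names below are illustrative assumptions. The last few lines also show a different convention, which averages the per-set F-Scores instead of taking the harmonic mean of the averaged precision and recall; the two conventions generally give slightly different numbers, so it is worth stating which one you report.

```python
def harmonic_mean(a, b):
    """Equally weighted harmonic mean of two numbers (0 if both are 0)."""
    return 2 * a * b / (a + b) if (a + b) else 0.0


# (TP, FP, FN) for each data set, as given in the article.
counts = [(12, 9, 3), (50, 23, 9)]

# Per-set precision and recall, as percentages.
precisions = [100.0 * tp / (tp + fp) for tp, fp, fn in counts]
recalls = [100.0 * tp / (tp + fn) for tp, fp, fn in counts]

# Macro-average: plain mean over the sets, each set weighted equally.
macro_p = sum(precisions) / len(precisions)
macro_r = sum(recalls) / len(recalls)

# Convention used in this article: harmonic mean of the averaged figures.
macro_f = harmonic_mean(macro_p, macro_r)

# Alternative convention: mean of the per-set F-Scores.
per_set_f = [harmonic_mean(p, r) for p, r in zip(precisions, recalls)]
macro_f_alt = sum(per_set_f) / len(per_set_f)

print(f"Macro-average precision = {macro_p:.2f}")
print(f"Macro-average recall    = {macro_r:.2f}")
print(f"Macro F (harmonic mean of averages) = {macro_f:.2f}")
print(f"Macro F (mean of per-set F-Scores)  = {macro_f_alt:.2f}")
```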

Suitability
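The macro-average method can be used when you want to know how the system performs overall across the sets of data, with every set counting equally regardless of its size. You should not base any specific decision on this average alone.

On the other hand, the micro-average is a useful measure when your datasets vary in size, because it pools the raw counts and so larger sets contribute proportionally more.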
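
To see why dataset size matters, note that the micro-average precision is the same as a weighted mean of the per-set precisions, where each set is weighted by its number of predicted positives (TP + FP). The short check below uses the article's own counts; the variable names are assumptions for illustration.

```python
# (TP, FP) for each data set, as given in the article.
counts = [(12, 9), (50, 23)]

precisions = [tp / (tp + fp) for tp, fp in counts]
weights = [tp + fp for tp, fp in counts]  # predicted positives per set

# Macro: every set counts equally.
macro_p = sum(precisions) / len(precisions)

# Micro: pool the counts first.
micro_p = sum(tp for tp, _ in counts) / sum(weights)

# Size-weighted mean of the per-set precisions.
weighted_p = sum(w * p for w, p in zip(weights, precisions)) / sum(weights)

print(f"macro    = {macro_p:.4f}")
print(f"micro    = {micro_p:.4f}")
print(f"weighted = {weighted_p:.4f}")  # matches micro
```

Here the second set is larger and has the higher precision, so the micro-average lands closer to P2 than the macro-average does.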

70 comments:

  1. Thanks for the article, just a minor typo: second to last paragraph: Micro rather than macro.

    1. Thanks a lot, Yahya. I corrected it. But the typo is in the last paragraph. If your dataset varies in size, micro-average is the useful tool.

    2. Hi, thank you for the article. I have a general question related to learning algorithms. I found this statement in an article: "The training data for the SVM classifiers are highly unbalanced as the proportion of positive training web pages ranges from 2% to 18%."
      What do we mean by unbalanced data, and by positive/negative training?
  3. I think there's a problem with your macro-average precision calculation. the denominator should be the sum of true positives and false positives. The micro average for precision and recall are the same. See here:
    http://metaoptimize.com/qa/questions/8284/does-precision-equal-to-recall-for-micro-averaging

  4. @Pallika. The scenarios are different. In my case, one classifier is applied on two different datasets. The number of instances are different in two datasets. But the example you provided is classifier performance for three different class labels. So the sum of positive and negative is always 27. The two macro and micro evaluation is different.

  5. Hi, thank you for the article. I did some experimentation on text classification with 7 categories of documents (labeled documents I prepared on my own). I did too much stemming and I got average precision 57%, micro precision = macro precision = 1, micro recall = macro recall = 0. Please suggest something to improve the results (for example: a bad training set, light stemming, threshold value and threshold step size modification, k if the classifier is kNN, how to improve recall for a better F1 score).
    Thanks.

  6. Good article. If you explained what micro- and macro-average are in 3 lines, it would be more helpful for readers.

  8. Really sweet, thank you! That was hell of a good explanation! :)

  11. Thank you very much for sharing your article

  13. sklearn implements 'macro' f1 as average of the two f1's and not harmonic means of the average precision and recall.

    f1 = (f1_1 + f1_2) / 2

    Check: https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/metrics/classification.py#L1098
