tag:blogger.com,1999:blog-3669150926823773792017-05-22T09:53:12.194-07:00I am LearningRushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-366915092682377379.post-77585030071576256782012-01-23T11:53:00.000-08:002012-01-23T13:40:43.157-08:00Language of Scientific Paper- Part 1Hello everyone! This is going to be a series of blog posts on the language of scientific research papers. When you write a paper, there are many do's and don'ts to follow. With practice, your papers will eventually become free of grammatical and typographical errors. This is one of the strengths of a research paper. However, when you are learning to write scientific papers and submit a draft to your supervisor for proofreading, you may have heard something like this from her- "Your paper is well written, I just need to change the STYLISTICS/ RHETORICS and/or I need to modify some of the language". Well, what kind of changes do supervisors make? Did you ever notice? If not, this is the right place for you. From my experience, I am going to give some tips on how to improve your stylistics or rhetoric and hence the language of your paper. Remember, this is the second line of strength of your paper. The most important strength is your work.<br /><br /><span style="font-weight:bold;">As are</span> <br />A very good term to start with. "As are" is used in the place of "which".<br /><br />For example, <span style="font-style:italic;">"The non-character tokens, which are any tokens that do not contain letters, are deleted"</span> is a fine sentence. A strong one. But what follows is a stronger sentence- <span style="font-style:italic;">"The non-character tokens are deleted, as are any tokens that do not contain letters"</span>.<br /><br /><span style="font-weight:bold;">Giveaway</span><br />The most common use of this term is to replace "disadvantage". 
For example, <span style="font-style:italic;">"The disadvantage of the algorithm is that it picks some garbage characters"</span> can be re-written with much more strength as <span style="font-style:italic;">"The giveaway of the algorithm is that it picks some garbage characters".</span><br /><br /><span style="font-weight:bold;">Comparably</span><br />It is quite possible that your algorithm does not beat the other benchmarks but performs very close to them (or perhaps sometimes better and sometimes not). This is the right term to use in these cases- <span style="font-style:italic;">"Our approach performs comparably to the state-of-the-art."</span><br /><br /><span style="font-weight:bold;">It goes on</span><br />I use this when I describe my methods or other people's methods, or when I reference other people's work. <span style="font-style:italic;">"The paper of X et al. [1] goes on to state their performances against the gold standard"</span><br /><br /><span style="font-weight:bold;">Because</span><br />You know this word for sure! But I did not know that the word has a powerful effect if you re-arrange the sentence <span style="font-style:italic;">"We could not achieve a better F-score because the dataset was small"</span> as <span style="font-style:italic;">"Because the dataset was small, we could not achieve a better F-score."</span><br /><br /><span style="font-weight:bold;">As well as/As good as</span><br />This is a synonym of "comparably", which we have already learnt, as in the sentence "Our approach performs as well as/ as good as the benchmarks"<br /><br /><span style="font-weight:bold;">A Priori</span><br />A substitute for "beforehand". Sometimes you may like to state something like "There is no way to answer questions like this before calculating crop factor". This sounds more professional when you use a priori. 
"There is no way to answer questions like this a priori, that is, without first calculating the crop factor".<br /><br /><span style="font-weight:bold;">Among them</span><br />Exclusively used when you give an example. <span style="font-style:italic;">"The algorithm uses many parameters- among them x and y- but not z"</span> means that the algorithm uses parameters like x and y but not z.<br /><br /><span style="font-weight:bold;">Couch in</span><br />Used to express "formulate in the same way". <span style="font-style:italic;">"Couched in the same terms as the arithmetic mean, the geometric mean can be expressed in a different way."</span><br /><br /><span style="font-weight:bold;">As well as</span><br />Simply, this works as the conjunction "and". But it can also be used at the beginning of a sentence. <span style="font-style:italic;">"It includes a wide range of processing tools and a variety of algorithms"</span> can be written as either <span style="font-style:italic;">"It includes a wide range of processing tools as well as a variety of algorithms" </span> or <span style="font-style:italic;">"As well as a variety of algorithms, it includes a wide range of processing tools".</span><br /><br /><span style="font-weight:bold;">Getting to know</span><br />To express the reason why you follow a method, this is a good phrase to pick. <span style="font-style:italic;">"As we know that the data is a part of the work, we developed data visualization tools"</span> can be written as <span style="font-style:italic;">"Getting to know that the data is a part of the work, we developed data visualization tools"</span>.<br /><br /><span style="font-weight:bold;">Way</span><br />When you explain the ways of doing something, you can use the following pattern: <span style="font-style:italic;">"One way to use the tools is .... Another is .... 
A third is ......"</span><br /><br /><span style="font-weight:bold;">Close second</span><br />An excellent phrase for introducing something important, though not quite as important as the thing you mentioned just before it. For example, <span style="font-style:italic;">"This is the most valuable tool in the package. A close second is its visualization capabilities".</span><br /><br /><br />(To be continued)Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com5tag:blogger.com,1999:blog-366915092682377379.post-16119870314281635542011-09-26T13:43:00.000-07:002011-09-26T15:01:26.556-07:00Statistical Significance: FAQSay you have collected 5 days of humidity readings for two adjacent cities, City A and City B. <br /><br />City A = {40,45,42,50,42}<br />City B = {38,45,40,48,52}<br /><br />The average humidity of City A is 43.8 and that of City B is 44.6.<br /><br />Most of the time, we conclude that City B is more humid than City A just by looking at the average humidity. This is inappropriate. To make that claim, we need to investigate further whether or not the difference is "statistically significant". An average is a tool for forming an assumption, not for making a claim.<br /><br />To say that two classes differ significantly from each other, we need to test their "statistical significance". There are a number of tools to test it. I am not discussing them here because one can google them and read more about them. What I am answering here can be seen as FAQs. <br /><br /><span style="font-weight:bold;">What are parametric and non-parametric tests?</span><br /><br />If you know that the data of your two classes follow a normal distribution, then you can choose from several significance tests that are parametric. 
If they don't, then choose a non-parametric test.<br /><br /><a href="http://en.wikipedia.org/wiki/Non-parametric_statistics">Link to non-parametric test list</a><br /><br /><span style="font-weight:bold;">How do I know that my data of two classes follow normal distribution?</span><br /><br />A simple approach is to group the values into class intervals, count the frequency of occurrence in each interval, and then plot them. The plot should have the class intervals on the x-axis and the frequency on the y-axis. If the points form a bell-shaped curve, then your data approximately follow a normal distribution. <br /><br />For a more precise, in-depth analysis, <a href="http://mathforum.org/library/drmath/view/72065.html">click here</a><br /><br /><a href="http://en.wikipedia.org/wiki/The_Bell_Curve">Click here if you don't know what a Bell curve is</a><br /><br />And to find several normality tests with which you can confirm that your data are normally distributed, <a href="http://en.wikipedia.org/wiki/Normality_test">click here</a><br /><br /><span style="font-weight:bold;">How do I determine whether I need a parametric test or non-parametric test?</span><br /><br />1. If you know that your data follow a normal distribution, use a parametric test; use a non-parametric test otherwise.<br /><br />2. If some values are extremely low or extremely high, use a non-parametric test, even if the rest of the data appear to follow a normal distribution.<br /><br />3. If you are unsure about the distribution of a sample, try to look at the whole dataset rather than the sample.<br /><br />4. Try to find out the sources that cause the data to scatter. If there are a number of sources, then the data most probably follow a normal distribution.<br /><br />5. If you have a large dataset, you can try either kind of test- in practice, both kinds of test perform well on large datasets. 
In contrast, both are weak on small datasets.<br /><br />Last but not least, many people choose parametric tests when they have no evidence that the data deviate from a normal distribution, and many people choose non-parametric tests when they are not sure that the data meet the requirements of normality.<br /><br /><span style="font-weight:bold;">I have seen paired and unpaired tests- which is appropriate?</span><br /><br />If each value in one dataset is matched with a value in the other, use a paired test; use an unpaired test otherwise.<br /><br /><span style="font-weight:bold;">Good, I have seen one-sided and two-sided p value also- can you tell me about them?</span><br /><br />First, tell me if you know what a null hypothesis is.<br /><br /><span style="font-weight:bold;">No, what is a null hypothesis?</span><br /><br />A null hypothesis states that there is no real difference between the two datasets. If you see that their averages differ, they differ by chance only.<br /><br /><span style="font-weight:bold;">Oh, okay, then tell me now about the one-sided and two-sided p value.</span><br /><br />If the null hypothesis is true, the one-sided P value is the probability that two averages would differ as much as was observed or further (see the example, they differ, don't they?) in the direction specified by the hypothesis just by chance, even though the means of the overall populations are actually equal. The two-sided P value also includes the probability that the sample means would differ that much in the opposite direction (i.e., the other group has the larger mean). 
For tests whose statistic has a symmetric distribution, such as the t test, the two-sided P value is twice the one-sided P value.<br /><br /><span style="font-weight:bold;">So, when should I use them?</span><br /><br />When you can state with certainty (and before collecting any data) that there either will be no difference between the means or that the difference will go in a direction you can specify in advance (i.e., you have specified which group will have the larger mean), you should use a one-sided P value during your test; otherwise, select a two-sided P value.<br /><br />1. If you select a one-sided test, you should do so before collecting any data.<br />2. You need to state the direction of your experimental hypothesis. <br />3. If the data go in the "wrong" direction, then you should use a two-sided P value. <br /><br />It is recommended that you always calculate a two-sided P value.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-33780384530369309112011-08-21T14:08:00.001-07:002012-01-23T04:40:43.887-08:00Micro- and Macro-average of Precision, Recall and F-ScoreI posted several articles explaining how precision and recall can be calculated, where the F-Score is their equally weighted harmonic mean. I was wondering how to calculate the average precision, recall and F-Score of a system when the system is applied to several sets of data.<br /><br />Tricky, but I found this very interesting. There are two methods by which you can get such average statistics in information retrieval and classification. <br /><br /><span style="font-weight:bold;">1. Micro-average Method<br /></span><br />In the Micro-average method, you sum up the individual true positives, false positives, and false negatives of the system for the different sets and then apply them to get the statistics. 
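<br /><br />The pooling step can be sketched in a few lines of Python. This is only a minimal sketch, not a definitive implementation: the function name micro_average and the dictionary keys tp, fp and fn are assumptions made for illustration, and the counts are the ones used for the two sets of data in this post.<br /><br />

```python
def micro_average(counts):
    """Pool the raw counts of every set, then compute precision and recall once."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    precision = tp / (tp + fp)  # pooled true positives over everything the system picked
    recall = tp / (tp + fn)     # pooled true positives over everything it should have picked
    return precision, recall


# Counts for the two sets of data used in this post (key names are hypothetical).
counts = [
    {"tp": 12, "fp": 9, "fn": 3},
    {"tp": 50, "fp": 23, "fn": 9},
]

micro_p, micro_r = micro_average(counts)
print(f"micro-average precision: {micro_p:.2%}")
print(f"micro-average recall: {micro_r:.2%}")
```

<br /><br />Because the counts are pooled before dividing, a larger set contributes more to the result than a smaller one.<br /><br />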
For example, for a set of data, the system's<br /><br />True positive (TP1)= 12<br />False positive (FP1)=9<br />False negative (FN1)=3<br /><br />Then precision (P1) and recall (R1) will be 57.14 and 80<br /><br />and for a different set of data, the system's<br /><br /><br />True positive (TP2)= 50<br />False positive (FP2)=23<br />False negative (FN2)=9<br /><br />Then precision (P2) and recall (R2) will be 68.49 and 84.75<br /><br />Now, the average precision and recall of the system using the Micro-average method is<br /><br />Micro-average of precision = (TP1+TP2)/(TP1+TP2+FP1+FP2) = (12+50)/(12+50+9+23) = 65.96<br />Micro-average of recall = (TP1+TP2)/(TP1+TP2+FN1+FN2) = (12+50)/(12+50+3+9) = 83.78<br /><br />The Micro-average F-Score will be simply the harmonic mean of these two figures.<br /><br /><span style="font-weight:bold;">2. Macro-average Method<br /></span><br />The method is straightforward. Just take the average of the precision and recall of the system on the different sets. For example, the macro-average precision and recall of the system for the given example is<br /><br />Macro-average precision = (P1+P2)/2 = (57.14+68.49)/2 = 62.82<br />Macro-average recall = (R1+R2)/2 = (80+84.75)/2 = 82.38<br /><br />The Macro-average F-Score will be simply the harmonic mean of these two figures.<br /><br /><span style="font-weight:bold;">Suitability</span><br />The macro-average method can be used when you want to know how the system performs overall across the sets of data. You should not base any specific decision on this average. <br /><br />On the other hand, the micro-average can be a useful measure when your datasets vary in size.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com9tag:blogger.com,1999:blog-366915092682377379.post-86407318980768200722011-08-12T02:00:00.000-07:002011-08-12T02:06:42.390-07:00Research Writing: That or Which?A very simple explanation of "that" and "which". 
But I would say- this one is the best clarification I have found on the web so far. I am now confident enough to use either of the two, whereas the papers I review mostly mix them up.
<br />
<br />[Originally from Mignon Fogarty]
<br />
<br />"
<br /><span style="font-weight:bold;">Restrictive Clause--That</span>
<br />
<br />A restrictive clause is just part of a sentence that you can't get rid of because it specifically restricts some other part of the sentence. Here's an example:
<br />
<br /><span style="font-style:italic;">Gems that sparkle often elicit forgiveness.</span>
<br />
<br />The words that sparkle restrict the kind of gems you're talking about. Without them, the meaning of the sentence would change. Without them, you'd be saying that all gems elicit forgiveness, not just the gems that sparkle. (And note that you don't need commas around the words that sparkle).
<br />
<br /><span style="font-weight:bold;">Nonrestrictive Clause--Which</span>
<br />
<br />A nonrestrictive clause is something that can be left off without changing the meaning of the sentence. You can think of a nonrestrictive clause as simply additional information. Here's an example:
<br />
<br /><span style="font-style:italic;">Diamonds, which are expensive, often elicit forgiveness.
<br /></span>
<br />Leaving out the words <span style="font-style:italic;">which are expensive</span> doesn't change the meaning of the sentence. (Also note that the phrase is surrounded by commas. Nonrestrictive clauses are usually surrounded by, or preceded by, commas.)
<br />"Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-80810850043739531342011-08-09T21:39:00.000-07:002011-08-09T21:40:58.920-07:00He as well as Me or He as well as I?I found a very useful article (I don't know the name of the person who posted it in a language forum, but I acknowledge him/her with the deepest gratitude).
<br />
<br />"as well as" functions as a conjunction in #1, not a preposition:
<br />
<br />#1. She was into drama and took part in many youth theater productions as well as [took part in] singing in choirs.
<br />
<br />"as well as" has two functions:
<br />
<br />conjunction: courageous as well as strong.
<br />preposition: The rhetoric, as well as the reasoning, is appreciated.
<br />
<br />Notice the commas on each side of the prepositional phrase. They set off or bar the grammar from counting it as part of the subject. That's why the verb is singular "is", and not plural "are". Take the commas away and the prepositional phrase changes identity. It becomes a conjunction + noun phrase that's counted as part of the subject:
<br />
<br />conjunction: The rhetoric as well as the reasoning are appreciated.
<br />
<br />Below in #2a, there aren't any commas setting off "as well as" from the grammar, so it's counted as part of the subject. "He as well as I" is a compound subject so the verb should be plural "are" (#2b), not singular "is":
<br />
<br />#2a. He as well as I is satisfied with the result. &lt;ungrammatical&gt;
<br />#2b. He as well as I are satisfied with the result.
<br />
<br />Subject verb agreement is also a problem for #3a. "He as well as me" is a compound subject; the verb should be plural:
<br />
<br />#3a. He as well as me is satisfied with the result. &lt;ungrammatical&gt;
<br />#3b. He as well as me are satisfied with the result.
<br />
<br />Now, add in the commas and "as well as" functions as a preposition,
<br />
<br />#2c. He, as well as I, is satisfied with the result. &lt;awkward&gt;
<br />#3c. He, as well as me, is satisfied with the result. &lt;awkward&gt;
<br />
<br />As a conjunction, "as well as" joins two like forms, i.e., courageous as well as strong; you as well as Sam. But in #3b, below, "as well as" joins two unlike forms, the subject pronoun "He" and the object pronoun "me".
<br />
<br />#3b. He as well as me are satisfied with the result. &lt;non-standard&gt;
<br />#3d. He as well as I are satisfied with the result. &lt;standard&gt;
<br />
<br />Now, "as well as me" is non-standard English, but nevertheless speakers will use "me" as well as "myself" as a way of placing the other person above them. It's a way of humbling oneself. Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com2tag:blogger.com,1999:blog-366915092682377379.post-89289278449337838992011-08-04T09:28:00.000-07:002011-08-04T09:38:57.852-07:00Excel Graph to EPS (MS Office 2007)Here is how I convert an Excel (MS Office 2007) graph to EPS so that I can use that graph in a TEX file.<br /><br />1. Open MS Excel. Copy the graph to a new sheet and make sure that it fits on one page (you can double check whether the graph fits on a page from the print preview option).<br /><br />2. Go to File-> Print-> Properties-> Advanced-> Postscript Option and select EPS. <br /><br />3. The file will be saved as EPS, so give the file the extension .eps<br /><br />4. Open GSview and open the .eps file you just saved. Go to File and select ps to eps. Give the output file the extension .eps as well. This is the final eps file that you can insert in your TEX code.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com4tag:blogger.com,1999:blog-366915092682377379.post-3282155059988376572011-08-04T09:16:00.000-07:002011-10-24T12:13:44.804-07:00From Tex to PDFThere are several ways to generate a PDF from a TEX file. 
I describe the four most popular ways here.<br /><br /><span style="font-weight:bold;">Method 1</span><br /><br /><span style="font-style:italic;">If you do not have a bibliography file</span><br /><br />% latex myfile (to generate myfile.dvi from myfile.tex) <br />% dvips myfile (to generate myfile.ps from myfile.dvi) <br />% ps2pdf myfile.ps (to generate the file myfile.pdf) <br /><br /><span style="font-weight:bold;">Method 2</span><br /><br /><span style="font-style:italic;">If you have a bibliography file</span><br /><br />% latex myfile (to generate myfile.dvi and myfile.aux from myfile.tex) <br />% bibtex myfile (uses the .aux file to extract cited publications from the database in the .bib file, formats them according to the indicated style, and puts the results into a .bbl file)<br />% latex myfile (to read the .bbl file into the document)<br />% latex myfile (to resolve the citations and cross-references)<br />% dvips myfile (to generate myfile.ps from myfile.dvi) <br />% ps2pdf myfile.ps (to generate the file myfile.pdf)<br /><br /><span style="font-weight:bold;">Method 3</span><br /><br /><span style="font-style:italic;">If you want to convert a TEX file directly to PDF and do not have a bibliography file<br /></span><br /><br />% pdflatex myfile<br /><br />N.B. If you have images in EPS format, you need to convert them into PDF format with the following command-<br /><br />% epstopdf image.eps<br /><br /><span style="font-weight:bold;">Method 4</span><br /><br /><span style="font-style:italic;">If you want to convert a TEX file directly to PDF and have a bibliography file<br /></span><br /><br />% pdflatex myfile<br />% bibtex myfile<br />% pdflatex myfile<br />% pdflatex myfile<br /><br /><br />N.B. 
If you have images in EPS format, you need to convert them into PDF format with the following command-<br /><br />% epstopdf image.epsRushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-49253530728706770652011-08-03T15:18:00.001-07:002011-08-03T15:24:23.020-07:00Two Figures Side-by-Side in LatexYou will often come across a situation where you need to put two figures side-by-side in a paper written in LaTeX. Of course, they are two different figures: one is, say, Figure 1 and the other is, say, Figure 2. How are you going to achieve this? Well, the following LaTeX code helps.<br /><br /><br /><blockquote><br />\begin{figure}[b]<br />\begin{minipage}[t]{0.48\linewidth}<br />\centering<br />\includegraphics[scale=0.4]{\string"Path and file name without extension\string".pdf}<br />\caption{Number of new connections in five chunks for six papers on Ischemia<br />and Glutamate}<br />\label{fig:figure1}<br />\end{minipage}<br />\hspace{0.5cm}<br />\begin{minipage}[t]{0.48\linewidth}<br />\centering<br />\includegraphics[scale=0.4]{\string"Path and file name without extension\string".pdf}<br />\caption{Number of dropped connections in five chunks for six papers on Ischemia<br />and Glutamate}<br />\label{fig:figure2}<br />\end{minipage}<br />\end{figure}<br /></blockquote><br /><br />The thing to remember is to create minipages. Each minipage holds one figure. 
All you need to adjust here is the position of the figure (here, I chose the bottom of the page), the width of each minipage as a fraction of the line width (mine here is 0.48; yours will depend on the page size), and the graphics scale.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-79636500979429677272011-04-05T10:58:00.001-07:002011-04-05T11:14:36.326-07:00Qualitative and Quantitative AnalysisOften in research papers, we find that "quantitative analysis on A and B could not be performed... so, we proposed a qualitative analysis on A and B"<br /><br />What do they mean by quantitative and qualitative analysis, and which fits in what cases?<br /><br />Let me give an example. Sachin Tendulkar and Inzamam-ul-Haq are two great cricket players. They have qualities- they are master batsmen, they have strokes, they have techniques and strategies. They also have quantities- number of hundreds, number of fifties, number of games played, games as captains.<br /><br />Now, if someone tells you to make a quantitative analysis of them, you chart their statistics and compare and contrast them. If someone asks you to provide a qualitative analysis, you analyze their quality as cricketers and its impact on their game and their teams.<br /><br />Quantitative analysis is not suitable here, because Sachin is an opening batsman and has more chances to score hundreds and fifties than Inzamam, who is a middle-order batsman. This is just one example- you can find many others. As they differ in batting positions, a quantitative analysis of them as batsmen is not a good choice. However, you can quantify their records as captains. Again, in some cases you cannot do that. For instance, maybe Sachin took over the captaincy when his team lacked quality players but Inzamam had the best men in his team. 
So, it depends.<br /><br />Quantitative analysis should be done more carefully compared to qualitative analysis. But again, sometimes it depends. You cannot make a qualitative analysis of Waqar Younis and Sachin Tendulkar as bowlers- it simply is wrong.<br /><br />I hope you are now able to map these examples onto your research and find a way to analyze either quantitatively or qualitatively or even both!Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-35734064333552466202011-03-24T08:55:00.000-07:002011-03-24T09:07:29.472-07:00True Negatives and AccuracyHello folks! Welcome back! We have talked about Precision and Recall, and along the way three terms appeared: true positives, false positives and false negatives. Today, we will talk about accuracy and hence, we need to know another term: true negative.<br /><br />Let's go back to our old example. <br /><br />You were given 10 balls in a box, 6 of which are white and 4 are red. You were asked to pick only the red balls from the box and you picked 7 balls- 2 of them are really red but 5 white balls fooled you.<br /><br />Now, our true positives are 2, false positives are 5 and false negatives are 2. <br /><br />A true negative is what you thought was negative and really was negative: in our case, a ball that you left in the box thinking it was white, and that really is white. So, the number of true negatives for our example is:<br /><br />True negatives = Total Balls - (True positives + False Positives + False Negatives)<br />= 10 - (2 + 5 + 2) = 1.<br /><br />Why are we eager to know the number of true negatives? 
Well, because if we want to measure how accurate you were in picking up red balls, then we need to know your true negatives as well.<br /><br />The formula for accuracy is<br /><br />Accuracy = (True positives + True negatives) / (True positives + True negatives + False positives + False negatives)<br /><br />In your case, your accuracy is equal to (2 + 1) / 10 = 30%.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-36135026106260261562011-03-19T19:44:00.000-07:002011-03-19T19:50:09.405-07:00Efficiency and EffectivenessDo you use these terms as synonyms? You are wrong!<br /><br />Scientifically, efficiency is output / input. Efficiency measures how much output you get from the resources you use. It refers to doing things in the right way. Say, you produce 10 kg of potatoes with 1 kg of fertilizer. To be more efficient, you need to produce more potatoes with 1 kg of fertilizer or less.<br /><br />Effectiveness, on the other hand, means doing the right thing. Say, you produce 10 kg of potatoes with 1 kg of fertilizer. To be effective as a farmer, you need to produce more good potatoes than bad potatoes; you don't have to bother about your fertilizer.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0tag:blogger.com,1999:blog-366915092682377379.post-31667544852241179032011-03-19T19:35:00.001-07:002011-03-19T19:44:03.094-07:00Proactive and Reactive ResearchSometimes you hear from your supervisor- "You have to be proactive in research" or sometimes they ask you to be reactive. What do they mean?<br /><br />Proactive activity means you are cautious about your future, you have a plan for facing problems in the future, or you are prepared for something "bad". When you are saving money to face troubles in the future, it is said to be a proactive activity. So, proactive research means that if you know a "situation" may occur in the future, you prepare yourself. 
To understand "situations" that may occur, you need to dig into your problem and find all possible "situations" that may occur in the future. <br /><br />So, you are a proactive researcher when you analyze your problem and formulate solutions before those situations arise. <br /><br />Reactive activity means a little bit of carelessness: I will provide solutions when a "situation" occurs, or I will react when the time comes. So, reactive researchers do not bother about the future; they just do what they are meant to do and when they face problems, they try to find a solution. <br /><br />Mostly, in my opinion, you need a blend of both to succeed in research: at times you need to be proactive and sometimes being reactive will bring you success.Rushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com1tag:blogger.com,1999:blog-366915092682377379.post-61984745986148471152011-03-17T11:00:00.001-07:002011-03-19T19:24:02.922-07:00Precision and RecallThese are very confusing terms- precision and recall. You have to understand these terms completely before you move forward.<br /><br />Say, you have 10 balls (6 white and 4 red balls) in a box. I know you are not colorblind but still somebody asked you to pick up the red balls from them. What you did is that you took 7 balls to be red, picked them from the box and put them in a tray. Among these 7 balls, you picked 2 red balls and 5 white balls (but you thought all of them were red).<br /><br />Your precision in picking red balls is number of correct pick-ups/(number of correct pick-ups + number of wrong pick-ups), which is 2/(2+5) = 2/7 = about 28.6% in this case. Notice that the denominator is simply the total number of pick-ups.<br /><br />Your recall in picking red balls is number of correct pick-ups/(number of correct pick-ups + number of red balls that you missed), which is 2/(2+2) = 2/4 = 50% in this case.<br /><br />Now, what do they mean? Precision says how exact you were among your pick-ups. 
So, as you picked them up as red balls, you were about 28.6% exact. Recall says how complete your pick-ups were. So, as you picked them up as red balls, you were 50% complete in identifying all the red balls.<br /><br />We learn at this point- Precision describes exactness and Recall describes completeness.<br /><br />From the same example, we will now take a look at how to connect the standard terminology with these simple examples.<br /><br />The correct pick-ups can be called "true positives", as they were red balls that you picked up and you were asked to pick the red ones. The balls you picked as red but that are actually white can be called "false positives"- you thought they were positive but they are not.<br /><br />So, if we modify the formula of precision with these terms, it turns into-<br /><br />Precision = true positives / (true positives + false positives)<br /><br />Again, you missed some red balls because you thought they were white. So, they can be called "false negatives", which means you thought they were not red balls, but they are.<br /><br />So, if we modify the formula of recall with this new terminology, it turns into-<br /><br />Recall = true positives / (true positives + false negatives)<br /><br />One thing pops out of this "re-written" version of the recall formula: it is the "rate of true positives". In other words, out of all the red balls, what percentage of red balls were grabbed. You had 4 red balls but you got 2 and missed 2: this means you could take 50% of the red balls!<br /><br />THIS IS HOW YOU WILL FIND PRECISION AND RECALL IN THE REALM OF CLASSIFICATION PROBLEMS.<br /><br />Now, we will move to a different realm: INFORMATION RETRIEVAL. It will require sets, so we will change our scenario a little. <br /><br />You have 10 files in a folder, 4 of which are about games and 6 of which are about weather. Now, somebody asked you to copy only the game files to another location. 
And what you did is you copied 7 files (thinking all of the 7 files you picked up were about games) and put them in a different location. But you picked only 2 game files and 5 weather files (but you thought they were all game files).<br /><br />REMEMBER, THE ANALOGY HERE IS THE SAME AS THE RED BALL-WHITE BALL PROBLEM.<br /><br />Now, your precision of copying game files will be:<br /><br />Precision = number of game files in the new location / number of files you copied.<br /><br />In this case, that is 2/7 = about 28.6% (the same as our previous example)<br /><br />Now, your recall of copying game files will be:<br /><br />Recall = number of game files in the new location / total number of game files.<br /><br />In this case, that is 2/4 = 50% (the same as our previous example)<br /><br />Some more equivalent definitions or explanations of these two terms:<br /><br />Precision<br />- A measure of the ability of a system to present only relevant items<br />- The fraction of correct instances among all instances that the algorithm believes to belong to the relevant set<br />- It is a measure of exactness or fidelity<br />- It tells how well a system weeds out what you don't want (Confused about this, but it is written in a document)<br />- Says nothing about the number of false negatives<br /><br /><br />Recall<br />- A measure of the ability of a system to present all relevant items<br />- The fraction of correct instances among all instances that actually belong to the relevant set <br />- It is a measure of completeness<br />- It tells how well a system performs to get what you want<br />- Says nothing about the number of false positivesRushdi Shamshttp://www.blogger.com/profile/01629234737321187924noreply@blogger.com0
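<br /><br />The set-based view of precision and recall from the file-copying example can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name precision_recall and the file names are hypothetical, chosen to mirror the folder with 4 game files and 6 weather files, of which 7 files were copied.<br /><br />

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items against the relevant set."""
    hits = retrieved & relevant  # true positives: relevant items that were actually retrieved
    # Guard against empty sets so the sketch never divides by zero (a choice made here).
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical file names: the 4 game files that exist in the folder.
relevant = {"game1", "game2", "game3", "game4"}
# The 7 files that were copied: 2 game files and 5 weather files.
retrieved = {"game1", "game2", "weather1", "weather2", "weather3", "weather4", "weather5"}

p, r = precision_recall(retrieved, relevant)
print(f"precision: {p:.1%}, recall: {r:.1%}")
```

<br /><br />The set intersection plays the role of the true positives, so the same function covers the red ball-white ball example if the balls are given labels.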