Thoughts on knowledge discovery versus hypothesis generation
Reading some of the philosophy behind scientific reasoning: discussions of incommensurability from Polanyi, Kuhn, and Feyerabend. Kuhn believed alternative hypotheses should be entertained only when a science is at a point of crisis, while Feyerabend suggested we might always want to entertain alternatives, perhaps as a necessity of scientific reasoning. Someone suggests that Kuhn's theories do not apply to biology. Oxidative phosphorylation is perhaps a good example for examining incommensurability, as two competing theories battled from 1960 or 61 until the late 70s.
Some argument
Suppose our goal is to mine information from the literature, and suppose the best we can do with text mining is to generate hypotheses rather than make discoveries. (Making knowledge discoveries requires a tight coupling between the signifier and the signified, between the world and the language used to describe it. Generating hypotheses suggests we are merely producing ideas, based on what's already present, that still need to be tested: generated hypotheses as emergent properties of the interaction of texts and text mining applications.) Then, given that information is essentially what captures variance, we might want to generate, as best we can, orthodox hypotheses (hypotheses that a majority of the data already confirms) and, more importantly, heterodox hypotheses (hypotheses suggested by the literature with little or no data supporting them). The more heterodox a hypothesis, the greater the potential for information. It's a risk/reward trade-off: the cost of testing heterodox hypotheses, given their reduced likelihood of being true, may be offset by the motherlode potential of confirming just one extremely heterodox theory.
Further, we may want to mine the various ways these hypotheses, whether heterodox or orthodox, are contradicted in the literature. It seems our burden in science is to posit a hypothesis and then try our best to disprove it in every way possible, rather than try to prove it.
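One way to make the heterodoxy trade-off concrete is a rough sketch. It rests on assumptions that are mine, not anything established above: I read "potential for information" as Shannon surprisal (the negative log of the probability that a hypothesis is true), and I treat the risk/reward question as a simple expected-value calculation. The function names, probabilities, payoffs, and costs are all hypothetical and chosen only for illustration.

```python
import math


def surprisal_bits(p):
    """Information gained, in bits, if a hypothesis with prior probability p is confirmed.

    Assumption: Shannon surprisal stands in for "potential for information".
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def expected_net_value(p, payoff, cost):
    """Expected reward of testing a hypothesis: chance of confirmation times payoff, minus cost."""
    return p * payoff - cost


# Hypothetical prior probabilities; none of these numbers come from real data.
hypotheses = [
    ("orthodox", 0.9),
    ("moderately heterodox", 0.25),
    ("extremely heterodox", 0.01),
]

for name, p in hypotheses:
    # The rarer the hypothesis, the more bits a confirmation would carry.
    print(f"{name}: p={p}, surprisal={surprisal_bits(p):.2f} bits")
```

Under these assumptions, an extremely heterodox hypothesis is only worth testing when the payoff of confirming it is large enough that `expected_net_value` stays positive despite the small probability, which is the "motherlode" case.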
http://tinyurl.com/4y39s
http://tinyurl.com/5j5b5
Metaphors define our cognitive scaffolding, which in turn defines and limits how we witness novel things (or anything, for that matter) and deem them interesting.