Genome Research

Erratum (v9, p1156)

Vol. 9, Issue 8, 673-674, August 1999

EDITORIAL
Hypothesis-Limited Research

Students are taught that the proper way to do science is through the following steps: first, devise a hypothesis, and then design experiments that will prove or disprove it. The conscientious scientist follows this with a deliberate collection of data from these carefully crafted experiments, and the data prove or disprove the aforementioned theory, thereby placing a new stepping stone in the path through the jungle of our unknown universe.

It seems obvious that this is the appropriate way of moving forward in science. In fact, in most published papers, the hypothesis is put forth, followed by the experimental proof and, finally, a restatement of the veracity of the theory and potential future steps. Grants for funding research are also presented in such a light: a pure statement of theoretical intent, with a description of the experiments designed to determine the accuracy of the hypothesis. It is clear that pursuing an answer to a preformed theory is certainly a more cerebral quest than a mindless gathering of data, and it is much more appealing to the ivory-tower mentality in us all. But more importantly, it is obvious that just collecting data, with no hypothesis in mind at all, would be, in a word, wasteful. Gathering tremendous amounts of information with no thought to purpose would certainly provide information, but most of it would be clutter, and the time, energy, and money required to gather all of the data one could imagine on a particular subject would be far too great to be worthwhile.

However, the current growth of high-throughput technologies for collecting some types of data may be tipping the balance with regard to the amount of time, effort, and money that is really wasted [for examples of high-throughput expression data collection and analysis, see Eisen et al. (1998), Spellman et al. (1998), Chambers et al. (1999), Iyer et al. (1999), and Rhee et al. (1999)]. Reports such as these illustrate the growing ease with which data collection can be done. With data acquisition becoming so fast and growing cheaper by the day, perhaps the time has come to let go of the hypothesis part and simply take in every possible bit of data one can, cataloging only how each measurement was taken and what its results were. Previously, clever experiments were designed to put a hypothesis on a knife edge for dissection but also to provide the most cost-effective, straightforward way to get at the answer. But is it now actually more cost-effective to have large pools of data---any data---created and stored, without thought for how these data might be used? The sequencing of genomes is one of the largest hypothesis-free collections of data the biological community has produced so far. As a community resource it is invaluable, and if it were not being done by the community as a whole, it would be far more expensive, take much more time (if it were done at all), and certainly be more wasteful in the redundancy of laboratories providing overlapping information. The costs of other types of data acquisition may now also be low enough that even the collection of data that might never be used would add so little to the overall cost as to make hypothesis-free data collection still the most efficient means of advancing our scientific understanding of biological systems.

Once pooled, the data can be examined in any variety of ways by anyone in the community and can tell the story about what is there. Just a voyage of discovery---no preconceived notion of what one might find---not unlike mapping some uncharted terrain, previously thought to end, perhaps, at the precipitous edge of the earth. Yes, one would need carefully designed tools for exploring, arranging, comparing, and cataloging. But the goal would be to sift through the data to find patterns that are present, not to bend the data to fit a theory as many have done, often unknowingly, in the past, as if they were writing "Monsters be here" in the center of an uncharted continent, after which imagination and belief made it more difficult to determine the truth.

Perhaps it is also time to admit that the time-honored belief that good science consists of first devising a hypothesis and then collecting data for proof does have a flaw---often ignored---one that crops up again and again in many a philosophy of science course; that is, in reality it is the hypothesis that follows the data. Hypotheses devised today for papers and grants actually stand on a tremendous foundation of data; they do not spring from thin air. The design of experiments to follow certainly does enable more testing of the theory, of course, but perhaps a communally available pool of data could take the place of many labs doing the same experiments and provide more unexpected discoveries than can be devised on fewer data.

This is, perhaps, the greater concern about clinging to hypothesis-driven research---the waste caused by what is missed. Most major scientific breakthroughs are the result of seeing unexpected patterns in data already gathered---patterns that might have been missed if one were bent on a set goal. For just a few examples, reach all the way back to Copernicus and the earth revolving around the sun, to Darwin and the theory of evolution, up through Barbara McClintock and the discovery of transposons, and on to Tom Cech and self-splicing RNA. All were discoveries of surprise that the data alone revealed, and some of these discoveries met with resistance because of the limitations imposed by the hypotheses current at the time.

So perhaps the time has come to just do some mindless gathering of data. The cost seems to be growing less. The usefulness as a resource to the community appears quite high. Is this a heretical idea---to ease up just a bit on our perhaps errant belief that we knew all along what the data would tell us? To give up on proposing theory first and collecting data after? To do so would require a great number of changes, including how and whether nonhypothesis research is funded. But science, after all, is often called heretical for one reason or another. It seems worth gathering data to test this hypothesis and move into an era of pattern-detection research rather than continuing to do research that might be hypothesis limited.

REFERENCES

Chambers, J., A. Angelo, D. Amaratunga, H. Guo, Y. Jiang, J.S. Wan, A. Bittner, K. Frueh, M.R. Jackson, P.A. Peterson, M.G. Erlander, and P. Ghazal. 1999. J. Virol. 73: 5757-5766.

Eisen, M.B., P.T. Spellman, P.O. Brown, and D. Botstein. 1998. Proc. Natl. Acad. Sci. 95: 14863-14868.

Iyer, V.R., M.B. Eisen, D.T. Ross, G. Schuler, T. Moore, J.C.F. Lee, J.M. Trent, L.M. Staudt, J. Hudson, Jr., M.S. Boguski, D. Lashkari, D. Shalon, D. Botstein, and P.O. Brown. 1999. Science 283: 83-87.

Rhee, C.H., K. Hess, J. Jabbur, M. Ruiz, Y. Yang, S. Chen, A. Chenchik, G.N. Fuller, and W. Zhang. 1999. Oncogene 18: 2711-2717.

Spellman, P.T., G. Sherlock, M.Q. Zhang, V.R. Iyer, K. Anders, M.B. Eisen, P.O. Brown, D. Botstein, and B. Futcher. 1998. Mol. Biol. Cell 9: 3273-3297.

Laurie Goodman


9:673-674 ©1999 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/99 $5.00


This article has been cited by other articles:

D. C. Thomas. Are We Ready for Genome-wide Association Studies? Cancer Epidemiol. Biomarkers Prev., April 1, 2006; 15(4): 595-598.

J. Margolin. Of Mice, Men, and the Genome. Genome Res., October 1, 2000; 10(10): 1431-1432.

J. C. Engert. Unlimited Hypothesis Research. Genome Res., March 1, 2000; 10(3): 271-272.

K. Lastowski and W. Makalowski. Methodological Function of Hypotheses in Science: Old Ideas in New Cloth. Genome Res., March 1, 2000; 10(3): 273-274.

