10 Steps to Success in Bioinformatics
```````````````````````````````````````````````````````````````````````````````````````````````````````````````
1. Become a biologist.
This message comes directly from a keynote address given by David Botstein at the 1997 RECOMB meeting. My interpretation of the adage is to organize one’s value system so that the driving principle becomes solving biological problems, rather than looking for biological problems to illustrate research in other areas. That is, be driven to solve problems concerning cancer, AIDS, ageing, development, evolution, etc., not by guidelines used in other disciplines, such as "my proof is harder than your proof".
However, this is still quite different from what I call the traditional way of doing biology, which is to work on the same problem for years, applying any techniques (bioinformatics, sequencing, mouse transgenics, etc.) needed to get answers. Sometimes scientists who take that approach become experts in a required technique, but often they assign the task to a lab member or buy the service from a vendor.
As a bioinformatics specialist, you will likely not work that way. Instead, you might bring your acquired tool set to various biological problems with the arrival of a new collaborator, set of data, or technology. For instance, your institution might buy a new kind of sequencing instrument, which would permit your local cancer biologist to address novel hypotheses if only she could find someone to analyze the sequence data. (That would be you.)
Becoming a biologist allows you to anticipate which computational challenges are most worthy of a substantial investment in software development. Moreover, communication with collaborators becomes much easier. Finally, I predict that as you get older, cancer and other aspects of human health will come to seem more interesting than, e.g., algorithm analysis.
2. Value your number of citations above your number of publications.
Bioinformatics is an excellent field for attracting high numbers of citations in the scientific literature. For instance, only a small proportion of papers in the journal Genome Research are devoted to describing bioinformatics software and/or web servers, but they account for 8 of the journal’s 12 most-cited papers (out of 2,663 publications as of 1 January 2009); those 12 are exactly the papers published in that journal that have been cited at least 500 times. Also, bioinformatics has some of the biggest citation monsters in all of science, such as the 1994 paper on Clustal-W (Thompson et al. 1994) with over 25,000 citations, despite the fact that the journals with the highest impact factors (e.g., Nature and Science) do not publish this kind of paper. By comparison, no 1994 paper from any discipline in Nature or Science has over 6,000 citations, and only four of the 5,858 papers published that year in those journals have over 3,000 citations.
One lesson that I take from comparing these numbers is that citation counts for bioinformatics papers are not directly comparable to those for papers that present scientific discoveries. Moreover, even among bioinformatics papers, citation counts are by no means guaranteed to provide an accurate assessment of a paper’s impact. However, citation counts have the advantage of being objective and easily obtained, and I believe that by observing characteristics shared by the most highly cited bioinformatics papers, one can glean valuable guidelines for structuring research programs.
3. Collaborate, and do it with great collaborators.
In my experience, collaborative projects are the best way to focus all of the necessary expertise on a biological question that has a major bioinformatics component. Also, it is essential (but not always easy) to find collaborators with whom you can be productive. Here is a list of about a dozen great collaborators that basically traces my career in bioinformatics. If you get an opportunity to work with one of these people, I recommend that you jump at the chance. Listed in approximately chronological order, they are: Gene Myers, David Lipman, Bill Pearson, Ross Hardison, Richard Gibbs, Eric Green, Ladeana Hillier, David Haussler, Jim Kent, Mathieu Blanchette, Adam Siepel, Tom Pringle, Bill Murphy, and Stephan Schuster. I also have a list of collaborators to avoid. Maybe someday over a beer I’ll give you those names.
4. Do not expect a warm welcome from everyone.
Some biologists will welcome you with open arms, some will be pleasant but too busy, and some will resent your trying to get lots of money to work on “their” problem or experimental system. Issues may arise because a traditional biologist thinks of bioinformatics as a routine service.
Once, on an NIH proposal-review panel, I remarked that a bioinformatics person applying for funding was not listed as a co-author on some of the papers where he claimed to be a collaborator. A biologist on the panel replied, “I never put my bottle-washers on my papers.” My first exposure to real hostility in this field came as quite a shock, because my initial collaborations had been so agreeable.
5. Be a good collaborator.
Everyone knows how to do this. Basically, it is just what you learned in kindergarten: maintain humility and a sense of humor, do more than your share, and deliver on time. However, not everyone adheres to these rules; you probably remember that “other list” mentioned at the end of principle 3.
6. Distribute and maintain software and/or run web servers that you personally continue to use.
When I look for the characteristics that distinguish the highly cited papers describing bioinformatics software from the less popular ones, this pops out as the strongest correlation. Writing software to be used only by collaborators or customers, or for a task that won’t interest you next year, just does not seem to work well; you may be able to get a publication out of it, but in 10 years it will probably look like a waste of time as judged by citation count.
7. Alternate between working on specific datasets and writing general-purpose software.
This principle is implied by #6. In computer science jargon, the distinction is between an instance of a problem and a full-fledged computational problem. Thus, the folks responsible for Clustal have also written biology papers about protein families, where it just so happens that they used Clustal to do the analysis. Incidentally, papers that focus on important datasets are how you can become a media star (another potential objective function).
8. Write some of your own software.
This is the most controversial of my suggestions. A bioinformatics leader about half my age once expressed amazement that I still write programs. However, relegating all programming to others causes problems, both for working on a specific dataset and for writing general-purpose software. When you get an idea for a program adjustment that might work better for the current dataset, you won’t want to wait until your student comes to the office to find out. More importantly, when a change is needed in software you have been distributing for five years, you won’t want to be without the services of someone who knows the code.
9. Don't give up.
I'm suggesting that you write papers, look at them in 5-10 years to see how many times they were cited, and adjust your research program accordingly. Obviously, this takes time. Fortunately, it gets easier as you learn more biology, develop a pool of collaborators, and get a reputation for playing fair.
10. Be excited about your work.
This is essential for maintaining a long-running research program. We all know that “burn-out” happens. The trick is to see when it is headed your way and make the sacrifices needed to avoid it. Over the years, different strategies have worked for me; below I outline my current approach.
````````````````````````````````````````````````````````````````````````````````````````````````````