Following the scientific literature: A personal practical guide for young computational biologists

by Igor Ulitsky & Ron Shamir

The goal of this guide is to describe why, when, where and how you can follow the most up-to-date science of interest, and which papers/journals you should follow. The guide is biased towards the fields of genomics/systems biology, but all the practical advice and the guidelines for the top biology and computational biology journals apply also if you’re into SNPs/proteins/phylogeny/etc. This guide also assumes that you’re passionate about biology and that your fields of scientific interest are not very narrow.

This is a personal perspective, and as such it is subjective. While many will agree with some of our guidelines and literature selection, probably none will agree with everything, and many will disagree with a lot of what we say. It is aimed primarily at students and young researchers making their first steps in following the literature. We hope that this guide will ease your first steps, and that after following the literature for a while, you will form your own different views and preferences (and eventually write your own better guide ;-).

 

Why should I constantly follow scientific literature?

There are simple pros and cons to devoting time to regularly getting updated on what’s going on in other labs worldwide:

·         Cons

o   It takes time. How much time? Depends on how wide (how many fields/journals you will follow) and how deep (how many papers in each field) you want to be.

o   Most of the papers you read will eventually turn out to be useless to you, as they will either be very poor, or you will never use the information you read.

The good news about these problems is that the amount of time it will take you to (a) decide if a paper is good; or (b) read a paper properly; drops dramatically over time, particularly if the paper is in the field you’re working on (probably about 5- to 10-fold over 5 years of a PhD).

·         Pros

o The more you read – the more knowledgeable you become about (a) the problem you’re working on in your current project; (b) the frontiers of science that you should work on, and that may form your next project; (c) the current status of science in general, approaches used, common difficulties etc. Thus, being up-to-date will save you a lot of time that you might otherwise waste “reinventing the wheel”:

§  working on problems already solved

§  thinking about problems that are solved/ will be solved very soon by others

§  writing your own code for problems already solved by tools available off the shelf.

o  Having a good grasp of your field is essential for writing fellowship requests, paper introductions, grant proposals etc. (as referring to up-to-date studies is very important)

o The more you read, the better the flow of your projects becomes, and the easier it is going to be to write your own paper(s). The ability to initiate and carry out a proper compbio project (choosing a problem, solving it, doing controls, finding new biology) will improve dramatically if you read papers about projects with similar structure (even if the problem they solve is entirely different).

o Reading many scientific papers improves your critical judgment – you will become much more critical about both the work of others and about your own work (which is always a good thing).

o (If you follow Science and Nature closely and read things outside your field.) You will read plenty of interesting and sometimes exotic stuff about neuroscience/education/behavior that does not make it to regular newspapers (monkeys count better than graduate students, people holding hot coffee cups are more likely to be nice to others etc.)

When should I get updated / read?

The short answer is all the time. Doing a routine update, e.g., once you finish a project, or once a month, sounds appealing but is impossible in practice. You will probably never get to it (there is always something that seems more urgent than spending two weeks just reading papers), and if you do find the time, you will encounter too many papers at once and miss most of the really useful material. Nothing will “sink in”, and you will eventually miss out on most of the Pros described above. Therefore, it is best to screen for interesting stuff continuously (i.e., at least once a week) and read whatever passes your ‘interesting’ filter. If you’re really flooded with work, still mark/print a paper if it looks interesting, and get back to it when you have time.

Where to get updated?

There are several main places you can get updates from:

·      PubMed – where you can get updates both based on specific keywords, by journal or by a specific author (e.g., the head of a lab that you know is doing stuff similar to yours).

·      Journal publisher’s websites – usually these get updated both when a new journal issue is out and when new “advance online” papers become available (it sometimes takes months before these find their way into an issue/print)

·      ISI Web of Science – a commercial site that TAU has a license to use - http://www.isiknowledge.com/

·      Science news websites, such as GenomeWeb, and news@nature.

·      Podcasts/Webcasts on science are available from several journals and can be an interesting thing to listen to while commuting (not driving!), jogging etc. Examples: Cell, Science, Nature and Science Times.

I will focus below on how to get updates from the first two sources, but the methods (i.e., RSS) apply to the third as well.

How to get updated?

There are three simple methods for regularly receiving updates about papers being published. The best is probably to use a mixture of these methods, as each has different strengths, and there are some journal-specific aspects (see table below). The three technical ways of getting updates are:

1.      RSS – the best (in my opinion) option to get updates from most journals and from PubMed. The general idea is that you need to use an RSS reader, which can be either standalone software (most e-mail programs also support RSS reading) or web-based, such as Google Reader (http://www.google.com/reader). You then subscribe to feeds, which are usually marked on web pages by an RSS icon. A feed is basically an XML file that lists a site's recent items (for a journal, its newly published papers). The RSS reader regularly checks this XML file, and if something changes there, it updates the info in your reader. Thus, by browsing through your unread items hourly/daily/weekly, you can efficiently get all the updates from a large number of websites. You can also mark items that interest you, share them with other people (if you both use Google Reader), easily e-mail an item etc. Marking items is particularly useful if you want to flag an interesting paper to read later (e.g., because the printer doesn’t work, because the university forgot to renew a journal or because a formatted PDF of the paper is not yet available). Importantly, as the usage of RSS is very widespread on the web, you can use RSS to follow other interesting stuff: news (e.g., the latest culture updates from CNN or Ynet), blogs, Facebook, the stock market, billboards etc.

2.      E-mail updates – these can also be used to get updates about specific keywords (from PubMed) or from specific journals (Table of Contents, or TOC alerts). The e-mail updates have some cons compared to RSS:

a.      Your Inbox gets cluttered

b.      Each e-mail will usually contain tens of papers, out of which only one is really of interest to you. If you want to mark it (e.g., to read later), share it or e-mail it to someone else, it is more difficult to do so.

c.       In most journals, it is possible only to get e-mail updates about the regular issues, and not about “advance online” papers.

3.      Visiting the publisher website once a week/2 weeks/month – this option is difficult to stick to and is not recommended.
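To make the RSS mechanism in option 1 concrete, here is a minimal sketch of what a reader does behind the scenes: parse the feed's XML, list its items, and report only those it has not shown you before. The feed below is a tiny hand-written stand-in with made-up titles and links (not a real journal feed), and the function names are hypothetical; a real reader would download the XML from the feed address on a schedule.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a journal feed.
# Titles and links are invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Journal: new articles</title>
    <item>
      <title>Paper A</title>
      <link>http://example.org/paper-a</link>
      <guid>http://example.org/paper-a</guid>
    </item>
    <item>
      <title>Paper B</title>
      <link>http://example.org/paper-b</link>
      <guid>http://example.org/paper-b</guid>
    </item>
  </channel>
</rss>
"""


def parse_items(feed_xml):
    """Return a list of (guid, title, link) tuples, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        # Fall back to the link when the feed gives no <guid>.
        guid = item.findtext("guid", default=link).strip()
        items.append((guid, title, link))
    return items


def unread_items(feed_xml, seen_guids):
    """Return items whose guid is not in seen_guids, then record them as seen."""
    new = [it for it in parse_items(feed_xml) if it[0] not in seen_guids]
    seen_guids.update(it[0] for it in new)
    return new


seen = set()
first = unread_items(SAMPLE_FEED, seen)   # first check: every item is new
second = unread_items(SAMPLE_FEED, seen)  # feed unchanged: nothing new

for guid, title, link in first:
    print(title, link)
```

The `seen` set plays the role of the reader's "unread" bookkeeping: checking the same unchanged feed twice yields nothing the second time, which is why browsing unread items scales to many journals.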

Using these three options, there are three main kinds of updates you can get:

1.      Get updates about every new paper coming out in journal X: This is the best option for the top journals or journals specifically in your main topic of interest. Even if the paper is not directly what you’re looking for, it is possible that it is relevant or can ignite your imagination. This can be done by either adding an RSS feed from the journal (listed below, or from the web) or signing up for e-mail alerts on the journal’s homepage (all leading journals have this option these days).

2.      Get updates about every paper about subject Y: This is a good option if you want to be sure you read everything about a specific topic (microRNA function, motif finding, protein interaction networks, ChIP-seq, metagenomics etc.). This is recommended, as some papers relevant to you may appear in good biomedical journals that are too tedious to follow, as they very rarely publish compbio-relevant stuff (Cancer Cell, Cancer Research, Blood, Immunity, NEJM, JAMA, Genetics etc.) The way to do this is by performing a search for your keyword in PubMed, then selecting “Send to->” RSS Feed or E-mail. This will result in a daily digest that you will get as soon as some papers with this keyword are added to the PubMed index.

Put effort into your query! Otherwise you will get a lot of irrelevant papers. Take into account that no matter how hard you try, if you search by a common keyword (e.g., microRNA function), >50% of the papers will come from very (very) small journals and will probably be irrelevant. On the other hand, getting such updates will usually make sure you’re not missing any publication relevant to you. Sample queries are “Ideker T[au]”, “microRNA AND (function OR evolution OR expression)” (also selecting “Limits: English language”) or “ChIP-seq”.

3.      Get updates about every paper that cites paper Z: There are a number of ways to do this. One is to find the paper in the ISI Web of Knowledge, and then create a ‘Citation alert’, which can be directed either to your e-mail or RSS.
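The sample PubMed queries given for method 2 can also be assembled programmatically, which helps when you maintain several keyword searches. The sketch below is only an illustration: the helper names are hypothetical, the `english[la]` tag is used as a stand-in for the “Limits: English language” option, and the URL targets NCBI's public E-utilities ESearch service rather than the “Send to” feature on the PubMed web page. Nothing here contacts the network; it only builds the strings.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (a programmatic alternative to the web form).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_query(topic=None, aspects=(), author=None, english_only=False):
    """Combine a topic, optional OR-ed aspects, an author and a language limit."""
    parts = []
    if topic:
        parts.append(topic)
    if aspects:
        parts.append("(" + " OR ".join(aspects) + ")")
    if author:
        parts.append(f"{author}[au]")      # [au] restricts the term to authors
    if english_only:
        parts.append("english[la]")        # [la] restricts by language
    return " AND ".join(parts)


def esearch_url(query, days=7, max_results=20):
    """Build an ESearch URL for papers that entered PubMed in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": query,
        "reldate": days,       # look back this many days
        "datetype": "edat",    # ...measured by the date the record entered PubMed
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


mirna = pubmed_query("microRNA",
                     aspects=("function", "evolution", "expression"),
                     english_only=True)
by_author = pubmed_query(author="Ideker T")
url = esearch_url(mirna)

print(mirna)
print(by_author)
print(url)
```

Keeping the query pieces separate makes it easy to tighten a search that returns too many irrelevant papers, for example by adding or removing one of the OR-ed aspects.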

What journals should I follow?

This is the most complex part to answer. For starters you should follow Science/Nature/PLoS Biology & PLoS Computational Biology (see table for details). Once you have handled that for 1-2 months, you’re ready to expand. On the one hand you should follow journals that publish things that interest you, but on the other hand, reading some other journals will expand your fields of interest. The answer is probably that you should follow each of Cell/Science/Nature/PLoS Biology, as these are the top biology journals, which publish almost entirely excellent science. You should try to read 1-2 papers from each of these every month or so (most will not be in your direct field, just pick ones that you think could be interesting). If you’re into genomics/systems biology, you should also follow the “top layer of genetics/genomics” listed below. Once you see you can handle this volume – add to your RSSs/E-mails the other journals of genomics/genetics listed below. In addition, you should follow the three leading compbio journals (listed below), and, once your screening ability and critical judgment improve, you can also scan through BMC Bioinformatics and PLoS One.

| Section | Journal | Current issue feed | New articles feed | Volume | Notes |
|---|---|---|---|---|---|
| Science / Biology in general | Cell | RSS | | Bi-weekly. Few papers but all describe landmark studies. | 'Resources' articles are particularly compbio-useful, as they usually describe huge datasets |
| | Science | RSS | RSS | Weekly. Only 3-4 papers per week are molecular biology papers, only about 0-1 per month are compbio. | E-mail table of contents are recommended for Nature & Science, since the News and Correspondence sections are also quite interesting sometimes |
| | Nature | RSS | RSS | Weekly. 5-6 papers on molecular biology | |
| | PLoS Biology | RSS | | A bi-weekly digest of papers is sent. 5-6 weekly papers on molecular biology. 0-1 per month are compbio. | |
| Top layer of Genetics / Genomics | Nature Genetics | RSS | RSS | Monthly, about 20-30% of the papers are relevant, unless you're interested in association studies | |
| | Nature Biotechnology | RSS | RSS | Monthly, only 2-3 papers are compbio relevant | |
| | Molecular Cell | RSS | | Bi-weekly | 'Resources' papers are particularly useful |
| | Genome Research | RSS | RSS | Monthly, most is compbio relevant | |
| | PNAS | RSS | RSS | Weekly, 4-5 papers per week are compbio relevant | PNAS contains different articles from very different fields of science, many of which are usually irrelevant to us (e.g., Microbiology etc.). Therefore, it is better to subscribe to the e-mail alerts, in which the table of contents is broken into sections (and then just read the parts of the table of contents that interest you). |
| | Molecular Systems Biology | RSS | | Bi-weekly, 5-6 articles, mostly compbio-relevant | |
| Reviews | Nature Reviews Genetics | RSS | | Monthly | |
| | Nature Reviews Molecular Cell Biology | RSS | | Monthly | |
| Genetic & Genomics - optional | PLoS Genetics | RSS | | A weekly digest of papers, few compbio-relevant | Mostly papers on genetics, but some are of more broad interest |
| | Trends in Genetics | RSS | | Monthly | Some papers are 'review-like', while other describe small 'peculiarities' found, that are usually difficult to explain mechanistically, and sometimes are very interesting |
| | Journal of Biology | RSS | | Very rare | The top layer of the BMC journals. Very few papers, but frequently very interesting ones |
| | Genome Biology | RSS | | Not very systematic, you get updates occasionally. Many compbio paper, that are usually much better than the ones in BMC Bioinformatics | BMC journals publish papers first in an unformatted way- 50+ pages PDFs, that are quite unreadable and not environmentally friendly. It is usually better to wait 2-3 weeks before printing/reading the paper |
| | BMC Genomics | RSS | | | |
| | Genes & Development | RSS | RSS | Monthly | |
| | Nucleic Acids Research | RSS | | Monthly | Publishes a very large number of papers (40-50). Usually ~5 are compbio-relevant, particularly if you're into sequence motifs or RNA |
| Core of CompBio journals | PLoS Computational Biology | RSS | | A weekly digest of papers | |
| | Bioinformatics | RSS | | Monthly | The e-mail table of contents alerts are broken into sections, making them easier to read |
| | Journal of Computational Biology | RSS | | Monthly | More computationally-oriented than the others |
| Additional optional CompBio journals | BMC bioinformatics | RSS | | Not very systematic, you get updates almost daily | Very large volume, frequently very low quality, but occasional gems |
| | PLoS ONE (Computational Biology section) | RSS | | A weekly digest | Similar to BMC bioinformatics, but even fewer gems |
| | IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) | RSS | | 4 issues a year | Computational orientation |
| | Journal of Bioinformatics and Computational Biology (JBCB) | | | About 3 issues a year | Computational orientation |
| | Briefings in Bioinformatics | RSS | | 1-3 per month | Reviews |
| Misc. journals | BMC Systems Biology | RSS | | Not very systematic, you get updates occasionally | More focused on the biophysical side of systems biology |
| | Cell Stem Cell | RSS | | | Top resource on stem cell biology |

 

Lots of RSS feeds are available here: http://barf.jcowboy.org/

RSS feeds for all Nature journals: http://www.nature.com/webfeeds/index.html

What should I avoid?

A very common scenario is that one decides to start following journals, subscribes to too many RSS feeds/e-mail alerts, gets flooded with too many updates to follow, and abandons following literature altogether. On the other hand, if you subscribe to just 1-2 journals and use RSS, you will rarely get any updates, which will probably cause you to forget about or stop opening the RSS reader. The best strategy is therefore to subscribe to one or more PubMed feeds with specific keywords relevant to your research (resulting in about 10-20 updates per week) + 5-10 top journals (e.g., Cell/Science/Nature/Nature Genetics/Genome Research/PLoS Biology/PLoS Computational Biology/Bioinformatics). If you see that you can handle the inflow of updates and read some papers for about 2 months, and still have appetite for some more science – gradually increase the number of keywords/journals you follow.

 

Questions? ulitskyi@tau.ac.il

 
