bigWig files

Genomics data are abundant nowadays, and chances are that someone has already run experiments similar to yours. In fact, we should search thoroughly before even thinking about running new samples, because the last thing you want to learn is that someone published the same data before you did.

Maybe one day your boss says he found an article. He says the authors did an interesting experiment and asks you to find out how your genes of interest behaved in their dataset.

So you go to the article. However, it has no supplemental Excel data sheet. You read it carefully to find out whether the authors reported formatted data somewhere.

The only thing you see is the GEO accession number of the study. So you search on Google or go to the GEO website and type in the GSE accession number. You scroll down the page and find that the data are available on an FTP server. But wait… it says bigWig file. What is this?

What is a bigWig file?

The UCSC website says the following: “The bigWig format is for display of dense, continuous data that will be displayed in the Genome Browser as a graph.”

Not clear? bigWig files are indexed binary files that store a numeric signal along the genome, typically the read coverage (pile-up) computed from aligned reads rather than the alignments themselves. In that sense a bigWig file is closer to a wiggle or bedGraph file than to a BED file listing where each read aligned. Because the file is indexed, a program can quickly jump to the region you want.
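To make "binary file" a bit more concrete, here is a minimal sketch that checks whether a file starts with the bigWig signature. It relies on the 4-byte magic number 0x888FFC26 from the bigWig format description; the function names are made up for this post, and the check only looks at the signature, not at the rest of the header.

```python
import struct

BIGWIG_MAGIC = 0x888FFC26  # signature stored in the first 4 bytes of a bigWig file


def looks_like_bigwig(first_bytes: bytes) -> bool:
    """Return True if the first 4 bytes match the bigWig signature.

    Both byte orders are checked because the file may have been written
    on a little-endian or a big-endian machine.
    """
    if len(first_bytes) < 4:
        return False
    little = struct.unpack("<I", first_bytes[:4])[0]
    big = struct.unpack(">I", first_bytes[:4])[0]
    return BIGWIG_MAGIC in (little, big)


def file_is_bigwig(path: str) -> bool:
    """Check a local file by reading only its first 4 bytes."""
    with open(path, "rb") as handle:
        return looks_like_bigwig(handle.read(4))
```

A plain-text wiggle or BED file fails this check, which is a quick way to tell whether a file with a .bw or .bigWig name really is in the binary format.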

Visualize pile-up data on UCSC Genome Browser

To actually look at bigWig data, you don’t need programming skills. If you are using bigWig files found on the GEO website, you DON’T need to DOWNLOAD the files. This is the key point: the UCSC Genome Browser can read the region around your gene of interest directly from the remote bigWig file. Because the file is indexed, the lookup is fast and there is no need to upload the entire file.

Step 1: Get the URL for FTP server

To get to the bigWig data, click the FTP link on the GEO website.

You will be asked whether you want to log in as a guest or as a registered user. Log in as a guest. A window will pop up with a folder that contains the files associated with the study. Open the folder, then right click (control + click on a Mac) and select “Get Info”.

Now copy the server URL.
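If you would rather not click through the FTP window, the folder address can usually be guessed from the accession number. The sketch below assumes the usual GEO FTP layout, where the last three digits of the accession are replaced by "nnn" in the parent folder name. The function name and the example accession GSE12345 are placeholders, so always compare the result with the link shown on the GEO page.

```python
import re


def geo_ftp_folder(accession: str) -> str:
    """Guess the GEO FTP supplementary folder for a GSE or GSM accession.

    Assumed layout: the last three digits of the number are replaced by
    "nnn" to form the parent folder, e.g. GSE12345 -> GSE12nnn/GSE12345.
    """
    match = re.fullmatch(r"(GSE|GSM)(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GSE/GSM accession: {accession!r}")
    prefix, digits = match.groups()
    kind = "series" if prefix == "GSE" else "samples"
    stub = prefix + digits[:-3] + "nnn"
    return f"ftp://ftp.ncbi.nlm.nih.gov/geo/{kind}/{stub}/{prefix}{digits}/suppl/"


# Placeholder accession, only to show the shape of the result.
print(geo_ftp_folder("GSE12345"))
```

The bigWig files themselves are often attached to the individual samples (GSM accessions) rather than to the series, so it is worth checking both folders.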

Step 2: Create A Track File

A track file is a small text file that gives the UCSC Genome Browser the key information about the bigWig files you want to visualize. To create one, open a text editor and write one track line per sample with at least the following information: type, name, description, and bigDataUrl. The type is “bigWig”. For name, choose a concise and understandable sample name. For description, you can add more detailed sample information, or simply repeat the sample name. Finally, bigDataUrl should be the FTP address of the bigWig file.
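With more than a couple of samples, typing the track lines by hand gets tedious. Here is a minimal sketch that builds one line per sample from the four fields described above. The sample names, the example.org URLs, and the output file name are all placeholders, not the ones from this study.

```python
def track_line(name: str, url: str, description: str = "") -> str:
    """Build one custom-track line for a remote bigWig file.

    If no description is given, the sample name is reused.
    """
    description = description or name
    return (
        f'track type=bigWig name="{name}" '
        f'description="{description}" bigDataUrl={url}'
    )


def write_track_file(samples: dict, path: str) -> None:
    """Write one track line per sample (name -> bigWig URL) to a text file."""
    with open(path, "w") as handle:
        for name, url in samples.items():
            handle.write(track_line(name, url) + "\n")


# Placeholder samples: replace with your own names and FTP addresses.
samples = {
    "ctrl_rep1": "ftp://example.org/ctrl_rep1.bw",
    "treat_rep1": "ftp://example.org/treat_rep1.bw",
}

for sample_name, sample_url in samples.items():
    print(track_line(sample_name, sample_url))
```

Calling write_track_file(samples, "tracks.txt") saves the same lines to a file that you can upload in the next step.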

If you want to know more about track files, there is a detailed tutorial for generating your own session here.

Step 3: Upload A Track File

Once you have created a track file, you need to upload it to the Genome Browser.

First, pick the genome that matches your data. In this example, I chose “insect” in clade, “D. melanogaster” in genome, and “BDGP R5/dm3” in assembly.

There are two ways to upload the file: either copy and paste the track file contents into the “PASTE URLs” box, or click “Choose File” and select the file. Then click “Submit”. If there is no error in your file, you will see the uploaded tracks listed in the window.

Here I successfully uploaded the data for 6 samples. Then click “go to genome browser”.


Since I didn’t specify a genome position for visualization, the browser uses the default position. You can change it by typing the actual genome coordinates or simply the name of your gene of interest. Note also that the browser can display a great many tracks, and many of them may not be necessary. Set the items you don’t want to see to “hide”, then click “refresh”.
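The position can also be put into the browser link itself, which is handy if you keep coming back to the same region. The sketch below builds such a link from the db and position parameters and, optionally, hgt.customText pointing at a track file. That last option only works if the track file is hosted at a URL the browser can reach. The coordinates and the example.org address are placeholders.

```python
from urllib.parse import urlencode


def browser_url(db: str, position: str, track_file_url: str = "") -> str:
    """Build a UCSC Genome Browser link for an assembly and a region.

    track_file_url is optional and must point to a track file that is
    reachable over the network, not a file on your own computer.
    """
    params = {"db": db, "position": position}
    if track_file_url:
        params["hgt.customText"] = track_file_url
    return "https://genome.ucsc.edu/cgi-bin/hgTracks?" + urlencode(params)


# Placeholder region on the dm3 assembly used in this post.
print(browser_url("dm3", "chr2L:1-10000"))
```

A gene name can be passed as the position in the same way as coordinates.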

I also want to point out that I used the “dense” display mode instead of “full” for the tracks on the screen. To see the pile-up, just change it to “full” for the samples you want to see.

Step 4: Sharing it with your collaborators

What you see here can be shared with anybody you like. If you don’t have an account, I recommend creating one. Then log in from here.

Once you log in, there are several options for sharing the session with someone else.

The first thing to do is save the session. Type a name and click “submit”.

Now click “Email” to send the link by email. Alternatively, save the current settings to a local file and give the file to the person you want to share with. Once they have the file, they go to the same page and upload it in the section “Use settings from a local file”. I prefer the second method, because sometimes when I used an old link, the results were not identical to the original ones.

About bioinfomagician

Bioinformatic Scientist @ UCLA
