Transfer Large Files on Web to Cluster
When you find large files on the web and need to transfer them to a computer cluster, what do you usually do? You can download the files to your desktop first, then transfer them to the cluster using Globus Online. It works, but it is a cumbersome two-step transfer that also requires space on your desktop, and once the transfer is done you still need to delete the local copies. In this post, I present two methods to simplify this process.
Mount Your Cluster Account on Mac Desktop
Install FUSE plug-in
We are going to use SSHFS to mount drives. To do so, you need both the FUSE plug-in (OSXFUSE) and SSHFS. Both programs can be obtained here. Once you have installed them, you will see an OSXFUSE icon in the System Preferences window.
Create a Directory to Mount the Drive
You can create the directory anywhere you want. Let’s create a folder called “Cluster” on your desktop.
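If you prefer the terminal, the folder can also be created there. This is a minimal sketch; the MOUNT_POINT variable name and the default location under your home directory are my own choices for illustration.

```shell
# Hypothetical variable: where the cluster drive will be mounted.
# Defaults to a folder called "Cluster" on the desktop.
MOUNT_POINT="${MOUNT_POINT:-$HOME/Desktop/Cluster}"

# -p creates missing parent directories and does not complain
# if the folder already exists.
mkdir -p "$MOUNT_POINT"
```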
Run a Command in Bash
Open terminal, and type
If the plug-in is properly installed, you should see
general options:
    -o opt,[opt...]        mount options
    -h   --help            print help
    -V   --version         print version
.....
.....
no mount point
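Another quick way to confirm that SSHFS is reachable from the terminal is to look for it on the PATH and ask for its version, using the -V flag listed in the help output above. This is only a sketch; the SSHFS_STATUS variable is a name I made up for illustration.

```shell
# Check whether the sshfs command is on the PATH before trying to mount.
if command -v sshfs >/dev/null 2>&1; then
    SSHFS_STATUS="found"
    # -V / --version is listed in the help output shown above.
    sshfs -V
else
    SSHFS_STATUS="missing"
    echo "sshfs not found; install OSXFUSE and SSHFS first" >&2
fi
```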
Then type the following to mount
>sshfs email@example.com:/path/to/cluster/directory /path/to/Desktop/directory
Make sure you have the right paths for both the cluster directory and the local directory. If everything is correct, you will be prompted for a password.
If the mount succeeds, the folder you created on the desktop will change its appearance.
Caution: Always mount the drive on an empty folder. Anything already in the folder will be inaccessible while a drive is mounted on it. For example, if you mount a drive on Mac HD, you will likely lose access to most of your files and crash the system.
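To guard against the mistake described in the caution, you could wrap the mount command in a small check that refuses to mount on a missing or non-empty folder. This is a sketch, not part of SSHFS itself; the function names is_empty_dir and mount_cluster are hypothetical helpers I introduce here, and the paths in the example call are the same placeholders used in the command above.

```shell
# Succeeds only if the argument is an existing directory with no
# entries at all (ls -A also lists hidden files).
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Hypothetical wrapper: mount only when the local folder is empty.
mount_cluster() {
    remote_dir="$1"
    mount_point="$2"
    if ! is_empty_dir "$mount_point"; then
        echo "refusing to mount: $mount_point is missing or not empty" >&2
        return 1
    fi
    sshfs "$remote_dir" "$mount_point"
}

# Example call with the placeholder paths from above:
# mount_cluster email@example.com:/path/to/cluster/directory /path/to/Desktop/directory
```

Note that a hidden file in the folder also counts as content here, so the wrapper errs on the side of refusing.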
Try Saving Something
Save the file you want by clicking “Save Link As…”, then select the Cluster folder on your desktop. The file will be saved directly to your cluster account. The image below is an example of downloading compressed FASTQ files from the Illumina website.
Remember to mount the drive on an empty directory. Files in a non-empty directory become inaccessible while the drive is mounted on it. If you accidentally mount in the wrong place, you will need to unmount the drive.
Unmount the Drive
If you mounted the drive in the wrong directory, or the mounted drive stops responding, you can unmount the drive and mount it again. To unmount,
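One common way to do this from the terminal is the standard umount command, given the local folder you mounted on. The sketch below first checks the output of mount so that it only tries to unmount something that is actually mounted. The function name unmount_cluster is a hypothetical helper, and the path in the example call is the placeholder used earlier.

```shell
# Hypothetical helper: unmount the drive mounted on the given folder.
unmount_cluster() {
    mount_point="$1"
    # mount prints one line per mounted file system, each containing
    # " on <folder> ". -F makes grep treat the path as literal text.
    if mount | grep -qF " on $mount_point "; then
        umount "$mount_point"
    else
        echo "$mount_point is not mounted" >&2
        return 1
    fi
}

# Example call with the placeholder path from above:
# unmount_cluster /path/to/Desktop/directory
```

The folder has to be spelled the way mount reports it. If the drive has stopped responding and a plain umount hangs or fails, umount -f forces the unmount.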
Alternatively, Use a Web Browser on your Cluster to Save
It is not uncommon for Unix and Unix-like clusters to have web browsers these days. I don’t usually use them to browse the internet because they are rather slow. However, this approach requires no software installation or disk mounting to save files from the web directly to your cluster account, so some people may prefer it.
Ask your Admin to Find which Browser is Available
There are a few common web browsers for Linux systems. Here is a list of the top 10 browsers. At my institution, I found that Firefox is available on the cluster. To use it, I log in with ssh -X and then type firefox. A Firefox window will pop up. Once the window is open, you can use it just like the browser on your local machine, only slower. Because the browser runs on the cluster, you can save web files directly to your account.
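The two steps can also be combined into one command, since ssh accepts a command to run on the remote side. This is a rough sketch that assumes Firefox is installed on your cluster and that an X server is running on your local machine (on recent versions of OS X that usually means XQuartz). The function name open_remote_firefox and the DRY_RUN switch are my own additions for illustration.

```shell
# Hypothetical helper: start Firefox on the cluster and display its
# window locally through X11 forwarding (the -X option of ssh).
# If DRY_RUN is set to a non-empty value, the command is only printed.
open_remote_firefox() {
    ${DRY_RUN:+echo} ssh -X "$1" firefox
}

# Example calls (replace the placeholder with your own login):
# DRY_RUN=1 open_remote_firefox email@example.com   # just show the command
# open_remote_firefox email@example.com             # actually run it
```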
Both methods work pretty well for transferring large files from the web. In my case, the second method was faster than the first. Try both and decide which you prefer.
Addendum on Mar 10th, 2014
One of my coworkers suggested a Mac application for mounting drives called MacFusion. It is easy to set up. The only things that need to be preinstalled are, again, these two programs.
1) FUSE for mac os X
Download from here
Setup needs pretty much the same information as above, and most people will have no problem using it if the methods above worked.