7. How to Transfer Files to/from the Yen Servers

In your data processing pipeline, you will need to transfer data to the Yen servers to carry out your analysis and then transfer the resulting files back to your local machine.

Using scp

Transfer files to the Yens

To transfer a single file, or several files matching a pattern, use:

$ scp mydata.txt <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>

This copies the file mydata.txt to your project space on the Yens. You will be asked for your password and Duo authentication every time you use scp, because the scp command uses ssh to transfer files.

To transfer all CSV files in the current directory, use:

$ scp *.csv <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>


Let’s transfer our example R script from the local machine to the Yen servers. On your local machine, in a terminal or Git Bash window, run:

$ cd ~/Desktop/intro-to-yens
$ scp swiss-parallel-bootstrap.R <SUNetID>@yen.stanford.edu:~

where ~ is shorthand for your home directory on the Yens. Enter your SUNet ID password and complete Duo authentication to finish the file transfer.

Transfer folders to the Yens

On your local machine, open a new terminal or Git Bash window and use the cd command to navigate to the parent directory of the folder that you want to transfer to the Yens.

For example, if you want to copy a folder from your Dropbox, use (on your local machine):

$ cd ~/Dropbox

On Windows, you can instead navigate to the parent directory in File Explorer, then right-click and choose “Git Bash Here” to open a new Git Bash window in that directory.

Once you are in the parent directory above the one you want to copy, run the following to copy the folder to the Yens:

$ scp -r my_folder/ <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>

where the -r flag copies folders (recursively copying the files inside them), <SUNetID> is your SUNet ID, and <my_project_dir> is the path to your project directory on ZFS or to wherever you want to transfer the files.

Let’s run a short example. We will create an empty folder called test_from_local that we will then transfer to the home directory on the Yens.

$ mkdir test_from_local
$ scp -r test_from_local/ <SUNetID>@yen.stanford.edu:~
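The choice between plain scp and scp -r can be wrapped in a small helper. The following is a minimal sketch, not part of any Yen tooling: the function name yen_put, the SUNETID and YEN_HOST variables, and the placeholder values are all hypothetical. The function only prints the scp command it would run, so you can check it before executing it yourself.

```shell
#!/bin/sh
# Hypothetical placeholder: replace with your own SUNet ID.
SUNETID="jdoe"
YEN_HOST="yen.stanford.edu"

# yen_put SRC DEST: print the scp command that would copy SRC to DEST on the Yens.
# Adds -r only when SRC is a directory, since scp needs -r to copy folders.
yen_put() {
    src=$1
    dest=$2
    if [ -d "$src" ]; then
        echo "scp -r $src ${SUNETID}@${YEN_HOST}:${dest}"
    else
        echo "scp $src ${SUNETID}@${YEN_HOST}:${dest}"
    fi
}

# Example inputs created in a temporary directory: one folder and one file.
workdir=$(mktemp -d)
mkdir "$workdir/test_from_local"
touch "$workdir/mydata.txt"

# The quoted "~" stays literal locally and is interpreted on the Yens.
yen_put "$workdir/test_from_local" "~"
yen_put "$workdir/mydata.txt" "/zfs/projects/students/my_project_dir"
```

Printing the command instead of running it keeps the sketch safe to try anywhere; copy the printed line into your terminal to perform the actual transfer.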

Transfer files from the Yens

Similarly, you can copy files back from the Yens to your local machine. Open a new terminal on your local machine (do not connect to the Yens), and use the cd command to navigate to the directory where you want the files to go. Then run:

$ scp -r <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>/results .

This copies the results folder from the ZFS file system on the Yens to your current local directory (the . means "copy here"). If you are copying files rather than directories, omit the -r flag. To transfer multiple files, use the wildcard * to match several files.
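When downloading several files with a wildcard, it helps to quote the remote path so that the pattern is passed to the Yens rather than expanded by your local shell. A minimal sketch, assuming hypothetical placeholder values for the SUNet ID and project directory; the script only prints the command:

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
SUNETID="jdoe"
PROJECT_DIR="my_project_dir"
REMOTE="${SUNETID}@yen.stanford.edu:/zfs/projects/students/${PROJECT_DIR}"

# Inside double quotes the shell expands variables but not the * wildcard,
# so the pattern reaches scp unchanged and is matched on the remote side.
SRC="${REMOTE}/results/*.csv"

# Print the command that would download every CSV file in results/.
echo "scp \"${SRC}\" ."
```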

Using rsync

Alternatively, we can use rsync to transfer files to/from the Yens.

Transfer files to the Yens

To transfer a file (for example, myfile.csv) from your local machine, use:

$ rsync -aP myfile.csv <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>

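You will be asked to enter your password and complete two-step authentication. The -a (archive) flag preserves permissions and timestamps and copies directories recursively; -P shows progress and keeps partially transferred files so that an interrupted transfer can be resumed.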

Transfer folders to the Yens

To transfer a folder to the Yens, pass the folder as the source. The -a flag already implies recursion, so the -r in the command below is redundant but harmless:

$ rsync -aPr myfolder/ <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>/myfolder

Transfer files from the Yens

To transfer a file (for example, myfile.csv) from the Yens to your local machine, use:

$ rsync -aP <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>/myfile.csv .

Transfer folders from the Yens

To transfer a folder (myfolder) from the Yens, use the folder as the source. Again, -a already implies recursion, so the -r is optional:

$ rsync -aPr <SUNetID>@yen.stanford.edu:/zfs/projects/students/<my_project_dir>/myfolder/ myfolder/
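A trailing slash on the rsync source changes what gets copied: myfolder/ copies the contents of the folder, while myfolder (no slash) copies the folder itself into the destination. The sketch below illustrates this with local temporary directories instead of the Yens, so it runs without a password. The directory and file names are made up for the demonstration, and it assumes rsync is installed on your machine.

```shell
#!/bin/sh
# Build a small demo tree in a temporary directory (names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/myfolder" "$demo/dest_a" "$demo/dest_b"
echo "x,y" > "$demo/myfolder/data.csv"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash: the folder itself is created inside dest_a.
    rsync -a "$demo/myfolder" "$demo/dest_a/"
    # Trailing slash: only the folder's contents are copied into dest_b.
    rsync -a "$demo/myfolder/" "$demo/dest_b/"
    # List the copied files to compare the two layouts.
    find "$demo/dest_a" "$demo/dest_b" -type f
else
    echo "rsync is not installed on this machine"
fi
```

The same rule applies when the source or destination is on the Yens, which is why the commands above write myfolder/ as the source and name myfolder explicitly in the destination.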

Using rclone to Google Drive

Another useful tool for data transfer is rclone, which is often faster than rsync. Unlike Globus, which only transfers between locations that have endpoints, rclone can transfer files to a wide range of destinations, including Google Drive, Dropbox, and other cloud storage.

Using rclone locally

On Windows: download rclone from the downloads page on the rclone website (rclone.org).

On Mac: install rclone with:

$ curl https://rclone.org/install.sh | sudo bash

Using rclone on the Yens

We can use rclone to push files or directories directly from the Yens to Google Drive (and to other remote locations such as Amazon S3 and Dropbox). First load the rclone module and confirm that it is loaded:

$ ml rclone
$ ml

Currently Loaded Modules:
  1) rclone/1.60.0

Setting up rclone

Before we can push data from the yens to Google Drive, we need to configure rclone once.

$ rclone config

The configuration menu will be presented:

No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> nrapstinGoogleDrive

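Select n to make a new remote and give it a name when prompted, for example your SUNet ID followed by GoogleDrive (nrapstinGoogleDrive above). On the Yens, $USER holds your SUNet ID, so this name can be written as ${USER}GoogleDrive in shell commands; the braces are needed because the shell would otherwise read $USERGoogleDrive as a single variable named USERGoogleDrive.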

Next, select the number corresponding to Google Drive. The numbering changes between rclone versions, so check that the number you enter is the one listed next to Google Drive.

18 / Google Drive
   \ (drive)
Storage> 18

When prompted for the next two options (the client ID and client secret), leave them blank and press Enter.

The next menu asks which permissions you want to give rclone. Choose 1 for full read-write access.

scope> 1

Then leave the next prompt blank and press Enter.

Choose n to not edit advanced config:

Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n

Choose n again since we are working on the remote Yen server:

Remote config
Use auto config?
 * Say Y if not sure
 * Say N if you are working on a remote or headless machine
y) Yes (default)
n) No
y/n> n

Next, we need to finish configuring Google Drive from the local machine. For this step, rclone must be installed locally. In a terminal on your local machine, run:

$ rclone authorize "drive" "xxxxxxxxxxxxxxxxxxxxxxx"

where “xxxxxxxxxxxxxxxxxxxxxxx” is the config token shown in the Yen terminal in the previous step.

This will open up a local web browser in which you can authenticate into your Google Drive using your Stanford account.

Once you authorize rclone, you will be given a code. Copy the code and paste it into the Yen terminal at the config_token> prompt.

Next, you will be asked if you want to configure this as a team drive. Press y if you are connecting to a shared Google Drive, or press n if you are connecting to your personal Google Drive.

Configure this as a team drive?
y) Yes
n) No (default)

Finally, press Enter to complete the config.

y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote

At the last prompt, press q to quit. rclone should now be set up to push files from the Yens to your Google Drive.

Using rclone

Here is a list of common rclone commands you can use to push files directly from the Yens to your Google Drive.

To list the configured remotes that you can push to:

$ rclone listremotes

To create a folder on Google Drive (the folder is created at the top level of your Google Drive):

$ rclone mkdir ${USER}GoogleDrive:GoogleDriveFolderName

Alternatively, you can specify the path to the new directory on Google Drive:

$ rclone mkdir ${USER}GoogleDrive:myFolder/subfolder/data

Here, myFolder and myFolder/subfolder already exist on Google Drive, and the command creates a new subdirectory data inside subfolder.
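A remote name built from $USER needs braces when it is written in a shell command, because the shell reads $USERGoogleDrive as a single (normally unset) variable called USERGoogleDrive. A minimal sketch of the difference, using a hypothetical SUNet ID:

```shell
#!/bin/sh
USER="jdoe"            # hypothetical; on the Yens $USER is already your SUNet ID
unset USERGoogleDrive  # make sure no variable with this name exists

# Without braces the shell looks up a variable named USERGoogleDrive (empty).
without_braces="$USERGoogleDrive:GoogleDriveFolderName"
# With braces the shell expands $USER and then appends GoogleDrive.
with_braces="${USER}GoogleDrive:GoogleDriveFolderName"

echo "without braces: $without_braces"
echo "with braces:    $with_braces"
```

If you named your remote something else during rclone config, type that name directly instead of relying on $USER.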

To list the contents of a folder on Google Drive:

$ rclone ls ${USER}GoogleDrive:GoogleDriveFolderName

To upload a directory to Google Drive using copy:

$ rclone copy /Path/To/Folder/ ${USER}GoogleDrive:GoogleDriveFolderName/

where /Path/To/Folder/ is the path on the Yens to the directory you want to upload.

To download from Google Drive to the Yens:

$ rclone copy ${USER}GoogleDrive:GoogleDriveFolderName /Path/To/Local/Download/Folder

where /Path/To/Local/Download/Folder is the path on the yens (or local machine) where you want to copy files to.
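Before copying a large directory, rclone's --dry-run flag shows what would be transferred without copying anything, and -P reports progress during the real transfer. The sketch below only assembles and prints the two commands; the remote name, folder name, and local path are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholders: use the remote name you chose in rclone config.
REMOTE_NAME="jdoeGoogleDrive"
REMOTE_DIR="GoogleDriveFolderName"
LOCAL_DIR="/zfs/projects/students/my_project_dir/results"

DEST="${REMOTE_NAME}:${REMOTE_DIR}/"

# Step 1: preview the transfer without copying anything.
echo "rclone copy --dry-run ${LOCAL_DIR}/ ${DEST}"
# Step 2: run the real transfer with progress reporting.
echo "rclone copy -P ${LOCAL_DIR}/ ${DEST}"
```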

For more details, see the official rclone documentation.