
Version 18 (modified by cbaribault, 5 years ago) ( diff )

In scp command examples, changed "remoteuser" to "tulaneID" - consistent with elsewhere in wiki.

File Transfer

Transferring Files

Windows 10, WSL, Linux, and Mac Terminal Window

You may transfer files between your workstation and Cypress on the command line using the scp command. This command behaves much like the basic Linux cp command, except you may use a remote address as the source or destination file. The syntax is as follows:

scp source_file destination_file

The following command will copy the file testfile from the /home/tulaneID/ directory on the remote server cypress1.tulane.edu to your workstation's local directory "." (a period represents the current working directory).

user@localhost> scp tulaneID@cypress1.tulane.edu:/home/tulaneID/testfile .

To copy a directory along with all its contents you will need to add the -r recursive flag. The following command will copy the simdata directory and all its contents to your local machine.

user@localhost> scp -r tulaneID@cypress1.tulane.edu:/home/tulaneID/simdata .
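The examples above copy from Cypress to your workstation. Because scp accepts a remote address as either the source or the destination, uploading reverses the two arguments, e.g. scp testfile tulaneID@cypress1.tulane.edu:/home/tulaneID/ . The sketch below wraps that pattern in a small shell function. The function name cypress_upload and the variables CYPRESS_USER and DRY_RUN are illustrative assumptions, not something provided on Cypress.

```shell
# Hypothetical helper (not part of Cypress): upload a local file or directory
# to a path on Cypress. With DRY_RUN=1 (the default in this sketch) it only
# prints the scp command instead of running it.
cypress_upload() {
    src=$1                             # local file or directory
    dest=$2                            # destination path on Cypress
    user=${CYPRESS_USER:-tulaneID}     # replace with your own Tulane ID
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "scp -r $src $user@cypress1.tulane.edu:$dest"
    else
        # -r also handles directories; it is harmless for single files
        scp -r "$src" "$user@cypress1.tulane.edu:$dest"
    fi
}
```

Run from your workstation, cypress_upload simdata /home/tulaneID/ prints the command it would run. Setting DRY_RUN=0 performs the actual transfer.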

Tulane Box Accounts

In your Cypress session, you can transfer files between your Tulane Box account and Cypress on the command line using the rclone command, which is available via the module rclone. (See Option 1 of 2 below and Module command).

In order to use rclone on Cypress, you must first create the config file ~/.config/rclone/rclone.conf on your Cypress account. You can generate your rclone.conf file either directly in your Cypress session or by creating it on your local machine and then copying it to your Cypress account via one of the following options.

Option 1 of 2: Generating Your rclone.conf File In Your Cypress Session

  1. Log in to Cypress from your local machine with X11 forwarding. (See X11 Forwarding.)
  2. In your Cypress session enter the following commands.
    module load rclone
    rclone config
    
  3. At the prompt …No remotes found - make a new one…, enter n for New remote.
  4. At the prompt name>, choose a meaningful name, e.g., TU-Box, to be used for future connections to your Tulane Box account from your Cypress command line session.
  5. At the prompt Type of storage to configure…, enter 6 for Box.
  6. At the prompt …Box App Client Id…, respond by pressing Enter for the default.
  7. At the prompt …Box App Client Secret…, respond by pressing Enter for the default.
  8. At the prompt Edit advanced config? (y/n), enter n for No.
  9. At the prompt Remote config - Use auto config?…, enter y for Yes.
  10. Observe the message ending in Waiting for code …, and wait for a separate browser window to open on your local machine with a Box web page. Depending on your network latency, you may experience a delay of 1 minute or more. In the Box web page, click on the link Use Single Sign On (SSO).
  11. In the next Box web page, enter your full Tulane email address and click on Authorize.
  12. This will take you to the Tulane login web page, wherein you can complete your Tulane login information and click on Sign in.
  13. This final web page should show the message Success! - All done. Please go back to rclone, at which point you can close the browser window and return to your Cypress session command line window.
  14. Back in the command line window, observe the new message Got code followed by your new token information ending with a prompt to which you can respond y for Yes this is OK.
  15. The resulting response shows the list Current remotes:, including your new Box remote, e.g., TU-Box followed by a prompt to which you can respond q for Quit config.
  16. Observe the newly created config file ~/.config/rclone/rclone.conf
    [tulaneID@cypress1 ~]$ ls -l ~/.config/rclone/
    total 8
    -rw------- 1 tulaneID group 233 Jun 29 20:00 rclone.conf
    [tulaneID@cypress1 ~]$ cat ~/.config/rclone/rclone.conf
    [TU-Box]
    type = box
    token = {"access_token":"ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567","token_type":"bearer","refresh_token":"ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890abcdefghijklmnopqrstuvwxyz12","expiry":"2020-06-29T21:03:28.009439309-05:00"}
    
    
    At this point, you're ready to run rclone on Cypress. (See Running rclone on Cypress below.)

Option 2 of 2: Generating Your rclone.conf File Using Your Local Machine

  1. Download, install, and configure rclone on your local machine by following the instructions for Windows, Mac, or Linux as appropriate starting from the rclone installation site.
    1. During the above configuration, you will be prompted for a name for the Box remote system, e.g. TU-Box.
    2. Make note of your response for the name of the Box remote system for future reference to be used in your invocations of the rclone command.
  2. Once you've configured rclone on your local machine, locate the resulting file rclone.conf on your local machine.
    1. On Mac and Linux, e.g., look for the file ~/.config/rclone/rclone.conf
    2. On Windows 8 (or earlier), e.g., look for the file %userprofile%\.config\rclone\rclone.conf
  3. Create as needed the directory ~/.config/rclone on your Cypress account.
  4. Copy the resulting file rclone.conf from your local machine to your Cypress directory ~/.config/rclone. See File Transfer.
  5. For security purposes, confirm that only you have read/write privileges on the resulting file ~/.config/rclone/rclone.conf on Cypress. See the ls and chmod commands under Linux Commands.
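Steps 3 and 5 above can be sketched as a short shell function to run on Cypress. This is a minimal sketch, not an official tool: the function name secure_rclone_conf is hypothetical, and the optional path argument exists only so the function can be tried on a scratch copy first.

```shell
# Hypothetical helper: create the rclone config directory if needed and
# restrict the config file so that only its owner can read or write it.
secure_rclone_conf() {
    conf=${1:-"$HOME/.config/rclone/rclone.conf"}
    conf_dir=$(dirname "$conf")
    mkdir -p "$conf_dir"       # step 3: create the directory as needed
    chmod 700 "$conf_dir"      # owner-only access to the directory
    if [ -f "$conf" ]; then
        chmod 600 "$conf"      # step 5: owner-only read/write on the file
        ls -l "$conf"          # the mode column should read -rw-------
    else
        echo "no config file at $conf yet" >&2
        return 1
    fi
}
```

Calling secure_rclone_conf with no argument acts on ~/.config/rclone/rclone.conf. If the file has not been copied over yet, the function creates the directory, reports the missing file, and returns a non-zero status.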

Once you have created the config file ~/.config/rclone/rclone.conf on Cypress, you can run rclone via the following.

Running rclone on Cypress

  1. To list the entire contents in your Box account with configured Box remote name TU-Box
    rclone ls TU-Box:
    
  2. To list the contents of a top-level folder in your Box account with configured Box remote name TU-Box
    rclone ls TU-Box:<top-level-folder>
    
  3. To copy the contents of a top-level folder in your Box account with configured Box remote name TU-Box to your current Cypress working directory…
    rclone copy TU-Box:<top-level-folder> .
    
  4. To copy the file test.txt in your current Cypress working directory to a top-level folder in your Box account with configured Box remote name TU-Box.
    rclone copy test.txt TU-Box:<top-level-folder>
    
  5. To simply list the available subcommands under rclone.
    rclone --help
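Before copying a large folder, it can help to preview the transfer. The sketch below uses rclone's --dry-run flag, which reports what would be copied without transferring anything. The function name box_copy_preview and the REMOTE variable are illustrative assumptions; TU-Box is the example remote name used above.

```shell
# Hypothetical helper: preview a copy of a local path into a folder of the
# configured Box remote. REMOTE defaults to the example name TU-Box.
box_copy_preview() {
    src=$1                         # local file or directory on Cypress
    folder=$2                      # destination folder in Box
    remote=${REMOTE:-TU-Box}
    if ! command -v rclone >/dev/null 2>&1; then
        echo "rclone not found; try 'module load rclone' first" >&2
        return 1
    fi
    # --dry-run only reports what would be transferred
    rclone copy --dry-run "$src" "$remote:$folder"
}
```

For example, box_copy_preview test.txt "<top-level-folder>" previews the upload from item 4. Running the same rclone copy command without --dry-run performs the transfer.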
    

Graphical Software

There are many graphical file transfer solutions available. The following are the three most popular and are fairly intuitive. Be sure to set each to connect to Cypress using the Secure File Transfer Protocol (SFTP).

FileZilla is available on all platforms. Be careful when downloading and installing it, as the hosting site, SourceForge, has begun to bundle bloatware with its downloads.

Fetch is a full-featured file transfer client for Mac and is free to the academic community.

WinSCP is a free Windows client.

Note: An installer may offer to install additional software that you do not want, so read each installer prompt carefully.

Example

Let's try out FileZilla

Storage on Cypress

Every Cypress user has two locations in which to store data: a small, high-security, low-performance personal home directory and a large, secure, group-shared Lustre directory.

Storage: home directory

Your home directory on Cypress is intended to store customized source code, binaries, scripts, analyzed results, manuscripts, and other small but important files. This directory is limited to 10 GB (10,000 MB), and is backed up. To view your current quota and usage, run the command:

quota -s -f /home

Please do not use your home directory to perform simulations with heavy I/O (file read/write) usage. Instead, please use your group's Lustre project directory.

Storage: Lustre group project directory

Cypress has a 699 TB Lustre filesystem available for use by active jobs, and for sharing large files for active jobs within a research group. The Lustre filesystem has 2 Object Storage Servers (OSSs) which provide file I/O for 24 Object Storage Targets (OSTs). The Lustre filesystem is available to compute nodes via the 40 Gigabit Ethernet network. The default stripe count is set to 1.

Allocations on this filesystem are provided per project/research group. Each group is given a space allocation of 1 TB and an inode allocation of 1 million (i.e. up to 1 million files or directories) on the Lustre filesystem. If you need additional disk space to run your computations, your PI may request a quota adjustment. To request a quota adjustment, please provide details and an estimate of the disk space used/required by your computations. Allocations are based on demonstrated need and available resources.

The Lustre filesystem is not for bulk or archival storage of data. The filesystem is configured for redundancy and hardware fault tolerance, but is not backed up. If you require space for bulk / archival storage, please contact us, and we will take a look at the available options.

Your group's Lustre project directory will be at:

/lustre/project/<your-group-name>

"your-group-name" is your Linux group name, as returned by the command "id -gn". Your group is free to organize your project directory as desired, but it is recommended to create separate subfolders for different sets of data, or for different groups of simulations.

To view your group's current usage and quota, run the command:

lfs quota -h -g `id -gn` /lustre

To view your own usage, you can run:

lfs quota -h -u `id -un` /lustre
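Since the project directory name is derived from your Linux group name, the path can be computed rather than typed. The following is a minimal sketch; the function name project_dir is hypothetical and only combines the path and the id -gn command given above.

```shell
# Hypothetical helper: print the Lustre project directory for the
# current user's primary Linux group, as reported by id -gn.
project_dir() {
    printf '/lustre/project/%s\n' "$(id -gn)"
}
```

On Cypress, cd "$(project_dir)" changes into your group's project directory, and mkdir -p "$(project_dir)/run01" creates a separate subfolder for one set of simulations (run01 is just an example name).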

High Performance Data Transfer

For high-speed transfer of large files (1 GB or larger), Cypress is currently equipped with the data transfer tool bbcp. A detailed guide to using bbcp can be found at http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm

Next Section

Working on a Unix System
