This documentation is under active development, meaning that it can change over time as we refine it. Please email email@example.com if you require assistance.
Copying files to and from M3
To copy files to and from M3, see the appropriate section below.
The M3 staff like to use FileZilla, which is available for Windows, Mac OS and Linux. Many other SFTP clients are available; which ones you can use depends on your platform. Here are a few:
FileZilla - https://filezilla-project.org/download.php?show_all=1 (Only download from this page, other pages/sites have malware)
WinSCP - https://winscp.net
Cyberduck - https://cyberduck.io
The following instructions are based around FileZilla.
Navigate to https://filezilla-project.org/download.php?show_all=1 and click on Download for your operating system.
Once downloaded, install the application on your machine.
On Windows the installer may also attempt to install WinZip and Opera. Make sure you deselect the checkbox if you don’t want these applications.
Start up FileZilla.
On Windows the look of the program will be different to the screen shots below but the location of buttons and fields is similar.
In FileZilla click on New Site and create a new site for M3.
Complete the following fields and click on Connect.
Protocol: SFTP SSH File Transfer Protocol
Logon Type: Ask for Password
User: Enter your M3 username
Enter your password.
FileZilla will connect to your home account on M3. Assuming everything went well you can now drag and drop files into M3.
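If you prefer a terminal to a graphical client, the same SFTP connection can be sketched with the OpenSSH sftp command. This is a minimal sketch, not part of the original instructions: M3_USER, M3_HOST and the helper name m3_sftp_put are hypothetical placeholders, and the hostname shown is not the real M3 address. Port 22 is the port listed in the WinSCP settings on this page.

```shell
# Hypothetical placeholders: replace with your own values.
M3_USER="your_m3_username"
M3_HOST="m3.example.org"   # placeholder only, NOT the real M3 hostname

m3_sftp_put() {
    # $1 = local file, $2 = remote directory relative to your M3 home.
    # Paths containing spaces are not handled by this simple sketch.
    # The commands are fed to sftp on stdin; sftp still prompts for your
    # password on the terminal.
    sftp -P 22 "${M3_USER}@${M3_HOST}" <<EOF
cd $2
put $1
bye
EOF
}

# Usage (file and folder names are examples only):
#   m3_sftp_put results.tar.gz MASSIVE_INPUT
```

Replacing put with get in the helper would copy a file in the other direction.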
Obtain WinSCP: https://winscp.net
Once downloaded, install the application on your machine.
Set up your connection to MASSIVE as shown here:
Port number: 22
Username: Enter your M3 username
Password: Leave blank, you will be prompted for it.
4. After connecting to MASSIVE, your screen will be similar to that below. In this example the folders MASSIVE_INPUT and MASSIVE_OUTPUT have been created locally on the Windows machine and remotely on MASSIVE.
Now you can manually transfer files to and from MASSIVE using the interface. To automatically monitor folders for new files read below.
‘Watch’ a local folder and automatically SFTP files to MASSIVE
WinSCP can monitor the contents of a local folder and automatically SFTP new files across to MASSIVE. Note: this is ONE way, Windows to MASSIVE. For the reverse direction, see below.
1. Click on ‘Commands’, ‘Keep Remote Directory up to Date…’
2. Fill out the correct path for the source folder to watch and the destination folder.
3. Click ‘Start’. An initial synchronization of the folders will be completed. After that, whenever new files are added to the Windows folder, they will be SFTP’d to the MASSIVE folder.
‘Watch’ a MASSIVE folder and automatically SFTP files to a local folder
To automatically bring files to Windows from MASSIVE, WinSCP executes a PowerShell script.
1. Click on ‘Commands’, ‘Static Custom Commands’, ‘Keep Local Directory up to Date…’
2. Set up the folders as required. In this example, MASSIVE_OUTPUT on MASSIVE and Windows is used. The synchronization interval can also be set; this is how often WinSCP checks MASSIVE for new files and then SFTPs them to Windows.
3. Click ‘OK’. A PowerShell window will then open. This process will continue until Ctrl+C is entered.
TIP: To monitor folders on both systems using WinSCP, first start ‘Keep Local Directory up to Date’ and then start ‘Keep Remote Directory up to Date’.
Globus can be used to transfer data at high speed in and out of MASSIVE. The Globus software is widely used to transfer data around the world between different endpoints. Endpoints are set up at a wide range of institutions and can also be created on your personal computer.
Update: Globus v5 has been installed on the MASSIVE data transfer nodes. The login process is the same, except the endpoint names have changed. The previous endpoints (massive#massive, massive#mern) will be decommissioned in the future. The main endpoint is now known as ‘MASSIVE M3’. For researchers connected to the Monash eResearch Network, massive#mern will be redeployed as ‘MASSIVE on MeRN’.
Click on “Log In” to Globus: https://globus.org
Authenticate to Globus.
There are a few options available.
if your organisation is in the list, select it and click continue (this works for Monash and others)
if you have a Globus ID, click on ‘Globus ID to sign in’
use your Google account (this works for Monash)
use your ORCiD iD
Choose the most appropriate option and follow the displayed pages.
After authenticating to Globus the File Manager will be presented.
Click on the icon as shown to ensure two panels are displayed.
The next step is to connect to the MASSIVE endpoint.
Click in the Collection text box. It does not matter which one, left or right.
Enter “MASSIVE M3” to search for the MASSIVE endpoint.
Select “MASSIVE M3” from the list.
You will be redirected to a page running at https://globusdev.massive.cloud.monash.edu/
This machine is running inside MASSIVE.
Use your MASSIVE username and password to ‘Sign In’.
After authenticating successfully, your MASSIVE home folder will be displayed.
Now you need to add another endpoint to transfer data in and out of MASSIVE.
Using the other ‘Collection’ field, enter the name of the required endpoint.
In the example shown below, ‘AARNet-Public-Test-Share’ has been used.
An Endpoint can be created on your personal computer. This allows you to transfer between your PC and any other endpoint, including MASSIVE.
To install, please choose the correct system and follow the Globus instructions.
Install, Configure, and Uninstall Globus Connect Personal for Mac OS X - https://docs.globus.org/how-to/globus-connect-personal-mac
Install, Configure, and Uninstall Globus Connect Personal for Windows - https://docs.globus.org/how-to/globus-connect-personal-windows
Install, Configure, and Uninstall Globus Connect Personal for Linux - https://docs.globus.org/how-to/globus-connect-personal-linux
Using the endpoint name you set up during installation, search for your Endpoint in the Globus File Manager. The process is the same as steps 3 and 4 above.
For the Linux version, you may find the browsable directories are restricted for your Personal Globus Endpoint.
The file “~/.globusonline/lta/config-paths” can be altered to fix this.
For full details please read this link: https://docs.globus.org/faq/globus-connect-endpoints/#how_do_i_configure_accessible_directories_on_globus_connect_personal_for_linux
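As a rough illustration of such a change, the sketch below appends one extra directory to the file. It assumes each line of config-paths has the form path,sharing flag,read-write flag (for example ~/,0,1), which should be checked against the Globus FAQ linked above. The directory /data/myproject and the helper name gcp_allow_dir are hypothetical.

```shell
# Path to the Globus Connect Personal access list (from the text above).
CONFIG_PATHS="${CONFIG_PATHS:-$HOME/.globusonline/lta/config-paths}"

# Hypothetical directory to expose; replace with a real path on your machine.
EXTRA_DIR="${EXTRA_DIR:-/data/myproject}"

gcp_allow_dir() {
    # Append "<path>,<sharing flag>,<read-write flag>" unless already listed.
    # Assumed meanings: 0 = sharing disabled, 1 = read-write access.
    line="$1,0,1"
    mkdir -p "$(dirname "$CONFIG_PATHS")"
    touch "$CONFIG_PATHS"
    grep -qxF "$line" "$CONFIG_PATHS" || printf '%s\n' "$line" >> "$CONFIG_PATHS"
}

# Usage:
#   gcp_allow_dir "$EXTRA_DIR"
```

Globus Connect Personal will most likely need to be stopped and restarted before the new directory becomes browsable.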
If your datasets contain symbolic links (or symlinks), please be aware of how Globus behaves with regard to these. Read this carefully: https://docs.globus.org/faq/transfer-sharing/#how_does_globus_handle_symlinks
This section assumes two endpoints have been connected to ‘File Manager’. If not, please refer to ‘Getting Started’ above.
In the image below, the right-hand side is highlighted (dark blue menu). This means any actions will apply to files from the endpoint “AARNet-Public-Test-Share”.
Possible file actions are highlighted in the middle of the screenshot.
To apply file actions to the endpoint MASSIVE, click on the left.
The Globus CLI software has been installed on MASSIVE.
module load globus-cli/2.0.0
globus

Usage: globus [OPTIONS] COMMAND [ARGS]...

  Interact with Globus from the command line

  All `globus` subcommands support `--help` documentation.

  Use `globus login` to get started!

  The documentation is also online at https://docs.globus.org/cli/

Options:
  -v, --verbose                  Control level of output
  -h, --help                     Show this message and exit.
  -F, --format [unix|json|text]  Output format for stdout. Defaults to text
  --jmespath, --jq TEXT          A JMESPath expression to apply to json
                                 output. Takes precedence over any specified
                                 '--format' and forces the format to be json
                                 processed by this expression
  --map-http-status TEXT         Map HTTP statuses to any of these exit codes:
                                 0,1,50-99. e.g. "404=50,403=51"

Commands:
  bookmark        Manage endpoint bookmarks
  config          Manage your Globus config file. (Advanced Users)
  delete          Submit a delete task (asynchronous)
  endpoint        Manage Globus endpoint definitions
  get-identities  Lookup Globus Auth Identities
  list-commands   List all CLI Commands
  login           Log into Globus to get credentials for the Globus CLI
  logout          Logout of the Globus CLI
  ls              List endpoint directory contents
  mkdir           Create a directory on an endpoint
  rename          Rename a file or directory on an endpoint
  rm              Delete a single path; wait for it to complete
  session         Manage your CLI auth session
  task            Manage asynchronous tasks
  transfer        Submit a transfer task (asynchronous)
  update          Update the Globus CLI to its latest version
  version         Show the version and exit
  whoami          Show the currently logged-in identity
The CLI allows users to access Globus data movement functionality via the command line. For more details please refer to: https://docs.globus.org/cli/
For Globus CLI examples: https://docs.globus.org/cli/examples/
This page contains a good list of examples, e.g. finding an endpoint, file manipulation, single item transfers, batch transfers and deletions, sharing endpoints, and safe resubmissions.
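To give a feel for the CLI, here is a minimal sketch of a recursive transfer between two endpoints, loosely modelled on the official examples. It assumes the globus-cli module is loaded and that globus login has already been run. The helper name m3_globus_copy, the label and the paths are hypothetical, and the option combination used to capture the task ID should be checked with globus transfer --help on the installed version.

```shell
# Assumes: module load globus-cli/2.0.0 and a completed `globus login`.

m3_globus_copy() {
    # $1 = source endpoint UUID,      $2 = source path
    # $3 = destination endpoint UUID, $4 = destination path
    task_id=$(globus transfer "$1:$2" "$3:$4" \
        --recursive --label "m3-copy" \
        --jmespath 'task_id' --format unix) || return 1
    echo "Submitted task $task_id"
    # Transfers are asynchronous; block here until the task finishes.
    globus task wait "$task_id"
}

# Find endpoint UUIDs by name first, for example:
#   globus endpoint search "MASSIVE M3"
# Then (UUID variables and paths are hypothetical):
#   m3_globus_copy "$SRC_UUID" /~/results/ "$DST_UUID" /~/results/
```

Because the transfer itself runs on the Globus service, the helper can be interrupted after submission without cancelling the task.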
The Globus Platform enables developers to utilise Globus functionality within their own research data applications and services.
The notebook Platform_Introduction_Native_App_Auth.ipynb contains examples of using the functionality available.
Transfer_API_Exercises.ipynb has examples on transferring files, checking for errors, cancelling transfers under certain conditions, monitoring transfers.
Sample research data portals can be viewed here: https://docs.globus.org/modern-research-data-portal/ These illustrate what can be achieved by using the Globus API.
For further information on the functionality available through the Globus API please refer to: https://docs.globus.org/api/
Use rsync to synchronise file systems and to transfer large numbers of files,
with the ability to stop and restart the file transfers. rsync will
replicate all files in a folder from one spot to another. It first analyses
both file systems to find the differences and then transfers only the changes.
A typical command to synchronise files from a local folder to M3 is:
rsync -auv -e ssh adirectory firstname.lastname@example.org:~/destinationdirectory/
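The command above can be wrapped in a small helper so that a transfer is previewed before it is run for real. This is a sketch rather than a recommended configuration: the helper name m3_rsync_push and the variable M3_TARGET are hypothetical, and the -z and --partial flags are additions to the command shown above.

```shell
m3_rsync_push() {
    # $1 = local directory
    # $2 = remote target, e.g. username@host:~/destinationdirectory/
    # $3 = optional extra flag, e.g. --dry-run
    # -a archive mode, -u skip files that are newer on the receiver,
    # -v verbose, -z compress in transit,
    # --partial keep partly transferred files so a rerun can resume them.
    rsync -auvz --partial ${3:+"$3"} -e ssh "$1" "$2"
}

# Usage (M3_TARGET is a placeholder for your own user@host:path):
#   m3_rsync_push adirectory "$M3_TARGET" --dry-run   # preview only
#   m3_rsync_push adirectory "$M3_TARGET"             # real transfer
```

If a transfer is interrupted, running the same command again only sends what is still missing or changed.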
rsync is very powerful and has many options to help transfer data. For
example, it can delete unwanted files (--delete), compress data before
transfer (-z), or let you see what command options might do without
actually executing them (--dry-run). For more info on