1. Introduction

This document provides a brief summary of the information you need to get started on Onyx quickly. For more detailed information, see the Onyx User Guide.

2. Get a Kerberos Ticket

For security purposes, you must have a current Kerberos ticket on your computer before attempting to connect to Onyx. A Kerberos client kit must be installed on your desktop to enable you to get a Kerberos ticket. Information about installing Kerberos clients on your desktop can be found at HPC Centers: Kerberos & Authentication.

3. Connect to Onyx

Onyx can be accessed via Kerberized ssh as follows:

% ssh user@onyx.erdc.hpc.mil
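Because a current ticket is required before connecting, it can help to check for one first. The following is a minimal sketch, assuming an MIT-style klist whose -s option reports ticket validity only through its exit status. The function names has_valid_ticket and onyx_login are hypothetical and are not part of any Onyx tooling.

```shell
# Returns 0 if the credential cache is readable and holds an unexpired ticket.
# Assumption: klist supports -s (silent mode, result in the exit status only).
has_valid_ticket() {
    klist -s
}

# Hypothetical wrapper: refuse to connect when no valid ticket is present.
# Usage: onyx_login USERNAME
onyx_login() {
    if ! has_valid_ticket; then
        echo "No valid Kerberos ticket found; obtain one before connecting." >&2
        return 1
    fi
    ssh "$1@onyx.erdc.hpc.mil"
}
```

With these functions defined in your shell, "onyx_login smith" attempts the ssh connection only when a ticket is available.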

4. Home, Working, and Center-wide Directories

Each user has file space in the $HOME, $WORKDIR, and $CENTER directories. The $HOME, $WORKDIR, and $CENTER environment variables are predefined for you and point to the appropriate locations in the file systems. You are strongly encouraged to use these variables in your scripts.

NOTE: $WORKDIR is a "scratch" file system, and $CENTER is a center-wide file system that is accessible to all center production machines. The $WORKDIR file system is not backed up. You are responsible for managing files in your $WORKDIR directories by backing up files to the archive system and deleting unneeded files. Currently, $WORKDIR files that have not been accessed in 30 days and $CENTER files that have not been accessed in 120 days are subject to being purged.

If the normal purge cycle determines that files in your $WORKDIR directory must be deleted, you WILL NOT be notified prior to deletion. You are responsible for monitoring your workspace to prevent data loss.
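Since purging is based on access time, one way to monitor your workspace is to list files that are approaching the limit. The sketch below uses the standard find command. The function name stale_files and the 25-day threshold in the usage note are illustrative assumptions, not site policy.

```shell
# List regular files under DIR whose last access is more than DAYS days old.
# find rounds file age down to whole days, so "-atime +25" matches files
# last accessed at least 26 days ago.
# Usage: stale_files DIR DAYS
stale_files() {
    find "$1" -type f -atime +"$2" -print
}
```

For example, 'stale_files "$WORKDIR" 25' lists scratch files that have gone unaccessed for most of the 30-day window, so you can archive or delete them yourself.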

5. Transfer Files and Data to Onyx

File transfers to DSRC systems must be performed using Kerberized versions of the following tools: scp, ftp, sftp, and mpscp. For example, the command below uses secure copy (scp) to copy a local file into a destination directory on an Onyx login node.

% scp local_file user@onyx.erdc.hpc.mil:/target_dir

For additional information on file transfers to and from Onyx, see the File Transfers section of the Onyx User Guide.

6. Submit Jobs to the Batch Queue

The Portable Batch System (PBS Professional™) is the workload management system for Onyx. To submit a batch job, use the following command:

qsub [ options ] my_job_script

where my_job_script is the name of the file containing your batch script. For more information on using PBS or on job scripts, see the Onyx User Guide, the Onyx PBS Guide, or the sample script examples found in the $SAMPLES_HOME directory on Onyx.
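A batch script is an ordinary shell script whose leading "#PBS" comment lines carry qsub options. The sketch below is illustrative only: the job name, project ID, resource request, and walltime are placeholders, and the exact select syntax for your work should be taken from the Onyx PBS Guide. The fallbacks for PBS_JOBID and WORKDIR are an added convenience so the script can also be dry-run outside the batch system.

```shell
#!/bin/bash
#PBS -N my_job
#PBS -q standard
#PBS -A MY_PROJECT_ID
#PBS -l select=1:ncpus=22
#PBS -l walltime=01:00:00
#PBS -j oe
# The values above are placeholders: replace the job name, project ID,
# resource request, and walltime with those appropriate to your allocation.

# PBS sets PBS_JOBID inside a job; fall back to "dryrun" elsewhere.
job_id="${PBS_JOBID:-dryrun}"
# WORKDIR is predefined on Onyx; fall back to a temp area elsewhere.
base_dir="${WORKDIR:-${TMPDIR:-/tmp}}"

# Run in a per-job directory under scratch space.
run_dir="${base_dir}/my_job.${job_id}"
mkdir -p "$run_dir"
cd "$run_dir" || exit 1

echo "Job ${job_id} running in ${run_dir}"
# Launch your application here; the launcher and its options are site-specific.
```

Saved as my_job_script, this would be submitted with "qsub my_job_script".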

7. Batch Queues

The table below describes the PBS queues available on Onyx. Jobs with high, frontier, and standard priority are handled differently depending on the requested walltime and core count.

Users should submit directly to high, frontier, or standard, which are routing queues. Jobs will be moved automatically into the appropriate large job "_lg", small job "_sm", or long walltime "_lw" queues.

Job priority starts at an initial value based on core count and the queue to which the job was submitted. It then increases for each hour that the job has been waiting to run.

Queue Descriptions and Limits on Onyx
(Queues are listed in order of decreasing priority.)

Priority  Queue Name   Max Wall Clock Time  Max Jobs  Min Cores Per Job  Max Cores Per Job  Description
Highest   urgent       24 Hours             N/A       22                 7,260              Designated urgent jobs by DoD HPCMP
          debug        1 Hour               4         22                 11,484             User testing
          HIE          24 Hours             2         22                 110                Rapid response for interactive work
          frontier_lg  24 Hours             2         7,261              143,968            Frontier projects only (large jobs)
          frontier_lw  168 Hours            15        22                 15,708             Frontier projects only (long walltime)
          frontier_sm  48 Hours             70        22                 7,260              Frontier projects only (small jobs)
          high_lg      24 Hours             2         8,449              105,820            Designated high-priority jobs by Service/Agency (large jobs)
          high_lw      168 Hours            15        22                 10,824             Designated high-priority jobs by Service/Agency (long walltime)
          high_sm      24 Hours             70        22                 8,448              Designated high-priority jobs by Service/Agency (small jobs)
          frontier_md  96 Hours             2         15,709             34,540             Frontier projects only (medium sized, long walltime)
          standard_lg  24 Hours             2         7,261              105,820            Normal priority jobs (large jobs)
          standard_lw  168 Hours            3         22                 5,808              Normal priority jobs (long walltime)
          standard_sm  24 Hours             70        22                 7,260              Normal priority jobs (small jobs)
          transfer     48 Hours             6         1                  1                  Data transfer jobs. Access to the long-term storage
Lowest    background   4 Hours              6         22                 7,260              Unrestricted access - no allocation charge
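To get a feel for how a request maps onto the limits in the table, the sketch below checks a core count and walltime against the three standard sub-queues. It is only a reading of the published limits, not the scheduler's actual routing logic, which this document does not describe. The function name standard_subqueue is hypothetical, and walltime is assumed to be given in whole hours.

```shell
# Guess which standard sub-queue fits a request, using only the table limits.
# Usage: standard_subqueue CORES HOURS
# Prints standard_sm, standard_lg, standard_lw, or "none" if nothing fits.
standard_subqueue() {
    cores="$1"
    hours="$2"
    if [ "$cores" -lt 22 ] || [ "$hours" -gt 168 ]; then
        echo "none"
        return 1
    elif [ "$hours" -le 24 ] && [ "$cores" -le 7260 ]; then
        echo "standard_sm"      # up to 24 hours, 22 to 7,260 cores
    elif [ "$hours" -le 24 ] && [ "$cores" -le 105820 ]; then
        echo "standard_lg"      # up to 24 hours, 7,261 to 105,820 cores
    elif [ "$hours" -gt 24 ] && [ "$cores" -le 5808 ]; then
        echo "standard_lw"      # up to 168 hours, 22 to 5,808 cores
    else
        echo "none"
        return 1
    fi
}
```

For example, "standard_subqueue 100 100" prints standard_lw, while a request for 6,000 cores over 100 hours fits none of the standard limits.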

8. Monitoring Your Job

You can monitor your batch jobs on Onyx using the qview, qstat, or qpeek commands.

The qview command lists all jobs in the queue. The "-u username" option shows only jobs owned by the given user, as follows:

% qview -u smith

                                        Num    Req'd      Elap
JobID    UserID  Queue      Job Name   Procs    Time      Time   ST ExHost
132277   smith   standard   job1        2520  96:00:00  66:35:00  R r8i0n0
133493   smith   standard   job2        2520  96:00:00  23:55:52  R r8i2n0
133503   smith   standard   job3        2520  96:00:00  23:43:50  R r18i0n0
133594   smith   standard   job4        2520  96:00:00 ---------  Q --------
133809   smith   standard   job5        2520  96:00:00 ---------  Q --------

Notice that the output contains the JobID for each job. This ID can be used with the qpeek, qview, qstat, and qdel commands.

To delete a job, use the command "qdel jobID".

To view a partially completed output file, use the "qpeek jobID" command.
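If you need the JobIDs in a script, they can be pulled out of the qview listing. The sketch below assumes the column layout shown in the sample output above (state in the eighth whitespace-separated field, job names without spaces). The function name queued_job_ids is hypothetical.

```shell
# Read qview output on standard input and print the JobID of each queued job.
# Header lines are skipped because their first field is not numeric.
queued_job_ids() {
    awk '$1 ~ /^[0-9]+$/ && $8 == "Q" { print $1 }'
}
```

For example, "qview -u smith | queued_job_ids" would print 133594 and 133809 for the listing above, and each ID could then be passed to qdel.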

9. Archiving Your Work

When your job is finished, you should archive any important data to prevent automatic deletion by the purge scripts.

Copy one or more files to the archive system
archive put [-C path ] [-D] [-s] file1 [file2 ...]

Copy one or more files from the archive system
archive get [-C path ] [-s] file1 [file2 ...]

For more information on archiving your files, see the Archive Guide.
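Bundling a run directory into a single tar file before archiving keeps related output together. The sketch below combines the standard tar command with the "archive put" syntax shown above. It assumes that the -C option names the destination directory on the archive system; the function name archive_run and the directory names in the usage note are hypothetical.

```shell
# Bundle RUN_DIR into a compressed tar file in the current directory, then
# copy that file to ARCHIVE_PATH on the archive system.
# Usage: archive_run RUN_DIR ARCHIVE_PATH
archive_run() {
    run_dir="$1"
    archive_path="$2"
    name=$(basename "$run_dir")
    tarball="${name}.tar.gz"
    # Store paths relative to the parent directory so the tar file unpacks
    # into a single top-level directory.
    tar -czf "$tarball" -C "$(dirname "$run_dir")" "$name" || return 1
    archive put -C "$archive_path" "$tarball"
}
```

For example, 'archive_run "$WORKDIR/run1" my_project' would create run1.tar.gz and place it under my_project on the archive system.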

10. Modules

Software modules are a convenient way to set needed environment variables and include necessary directories in your path so that commands for particular applications can be found. Onyx also uses modules to initialize your environment with application software, system commands, libraries, and compiler suites.

A number of modules are loaded automatically as soon as you log in. To see the modules that are currently loaded, use the "module list" command. To see the entire list of available modules, use the "module avail" command. You can modify the configuration of your environment by loading and unloading modules. For complete information on how to do this, see the Modules User Guide.
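In a job script it is often useful to load a module only when it is not already present. The sketch below builds on the "module list" command mentioned above and the standard "module load" subcommand. The function name ensure_module is hypothetical, and the match is a simple substring search, so a short name may match more than one loaded module.

```shell
# Load a module unless "module list" already reports it.
# "module list" commonly writes to standard error, so both streams are searched.
# Usage: ensure_module NAME
ensure_module() {
    if module list 2>&1 | grep -q -- "$1"; then
        echo "$1 already loaded"
    else
        module load "$1"
    fi
}
```

Call it with a name taken from the "module avail" listing on Onyx.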

11. Available Software

A list of software on Onyx is available on the software page.

12. Advance Reservation Service

A subset of Onyx's nodes has been set aside for use as part of the Advance Reservation Service (ARS). The ARS allows users to reserve a user-designated number of nodes for a specified number of hours starting at a specific date/time. This service enables users to execute interactive or other time-critical jobs within the batch system environment. The ARS is accessible via most modern web browsers at https://reservation.hpc.mil. Authenticated access is required. The ARS User Guide is available on HPC Centers.