User Policy
Under Revision
The user policy is currently undergoing some major updates, which will be announced once complete.
Version 7b (as of fall 2022)
1. Definitions
BGSC: Blugold Supercomputing Cluster. The cluster was funded by Blugold Differential Tuition in 2013 and includes nodes purchased by individual faculty members. Its hardware does not support large-scale parallelization, so it is mainly used for serial calculations, parallel jobs within a single node, and user training. Priority is given to course-specific assignments.
BOSE: The cluster funded by an NSF MRI grant in 2020, named after the Indian theoretical physicist Satyendra Nath Bose. Its hardware is designed to enable large-scale parallelization. Unless otherwise noted, the same policies apply to both clusters. Priority is given to research and highly parallel jobs.
BGSC.ADMINS: A group of faculty, staff, and students in charge of the daily operations of both BGSC and BOSE. Service requests should be made by emailing BGSC.ADMINS@uwec.edu.
BGSC.ADVISORY: A group of UWEC faculty members who believe in the power of high-performance computing in undergraduate education and discovery. Working collaboratively, this group is responsible for promoting, planning, developing, implementing, and maintaining high-performance computing resources on campus.
2. User Accounts
2.1 UWEC Accounts
All UWEC faculty, staff, and students have access to BGSC and BOSE with a valid username and password. Upon first login, their accounts and home directories will be created. Accounts remain active unless deleted as described in section 2.5. By default, each home directory is assigned a 500 GB disk quota and is not backed up. Requests to increase the disk quota shall be reviewed by BGSC.ADMINS and granted only when absolutely needed.
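As a quick way to check usage against the quota, the standard du utility reports the total size of a home directory; this is a minimal sketch and assumes shell access to the cluster:

```bash
# report total disk usage of your home directory (compare against the 500 GB quota)
du -sh $HOME
```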
2.2 Non-UWEC Clients
As part of the outreach efforts, faculty, staff, and students from other UW campuses can also access the clusters by contacting BGSC.ADMINS@uwec.edu and providing basic information regarding their expected usage of the cluster (expected core-hours, software requirements, duration of the project, etc.). BGSC.ADMINS shall review the request and, in most circumstances, approve it and provide UWEC login credentials. Non-UWEC clients also receive a 500 GB disk quota.
HPC service is also available to non-UW-system users, and a service fee will be charged based on the core-hours used. The rate for the service fee shall be determined by BGSC.ADVISORY and shall be communicated to and agreed to by the user before account creation. The service fee can be waived if:
- The non-UW user is working with a UWEC faculty member on a collaborative project involving UWEC students.
- The non-UW user is participating in HPC outreach activities offered through UWEC.
2.3 Mandatory User Training
All new users shall complete mandatory user training, available through Canvas. A training partition is available on BGSC for all new users to test parallel efficiency and practice job submission. BGSC.ADMINS will help determine the best submission parameters in the SLURM submission script; a sketch of such a script is shown below.
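As an illustration, a minimal SLURM submission script for practicing on the training partition might look like the following. The job name, core count, and executable are hypothetical placeholders; BGSC.ADMINS can advise on appropriate values.

```bash
#!/bin/bash
#SBATCH --partition=training    # new users start on the training partition
#SBATCH --job-name=practice
#SBATCH --nodes=1
#SBATCH --ntasks=4              # small core count for testing parallel efficiency
#SBATCH --time=01:00:00         # well within the 1-day training limit
#SBATCH --output=practice_%j.out

srun ./my_program               # "my_program" is a hypothetical executable
```

The script is submitted with `sbatch practice.sh`, and its status can be checked with `squeue -u $USER`.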
Initially, all new users are only allowed to submit to the training partition on BGSC. Upon request, an online quiz will be given. If all questions are answered correctly, the user can request a meeting with BGSC.ADMINS to discuss the computational resources needed; based on that discussion, the proper hardware (BGSC, BOSE, or both) and job priority will be assigned. If the new user is from an existing research group, this meeting can be replaced by an email request from the faculty advisor. In general, the same hardware and priority will be assigned to users from the same group.
User training should be performed for classroom activities as well. The instructor of the class should coordinate the user training and work with BGSC.ADMINS to determine the required resources, e.g., BGSC or BOSE, number of nodes, and expected core-hours. Students from the same class will be assigned the same hardware and priority.
2.4 User Groups
To facilitate the sharing of data, user groups will be set up upon request. It is the faculty or staff advisor’s responsibility to contact BGSC.ADMINS to set up user groups for research or classroom projects. For data security, BGSC.ADMINS will not set up permissions for an individual account unless requested by the user.
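Once a group exists, a member can share files with the rest of the group using standard Unix permissions. The group name and directory below are hypothetical placeholders:

```bash
# give the research group read access to a results directory
chgrp -R myproject_group ~/results   # "myproject_group" is a hypothetical group name
chmod -R g+rX ~/results              # group members may read files and enter subdirectories
chmod g+x $HOME                      # allow group members to traverse your home directory
```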
2.5 User Deletion
If a user remains inactive for 180 days, a reminder email will be sent to the user and the faculty advisor. After receiving the reminder email, the user shall log in to BGSC/BOSE within 30 days. Failure to do so will result in unrecoverable deletion of the user account as well as all user data. An extension of 30 days can be granted if a request is made to BGSC.ADMINS by a UWEC faculty member.
User accounts and data can also be deleted at the request of the user or the faculty advisor.
2.6 Users’ Responsibilities
All BGSC users shall maintain good computing practices and follow all policies while using high-performance computing resources. In addition, all users are expected to:
- Acknowledge the support of BGSC (Blugold Differential Tuition) or BOSE (NSF MRI Award #: CNS-1920220) in publications that involve data generated on BGSC or BOSE. The following acknowledgement should be used: “The computational resources of the study were provided by the Blugold Center for High-Performance Computing under NSF grant CNS-1920220”.
- Back up user data regularly; BGSC.ADMINS is not responsible for any data loss (see the backup sketch after this list).
- Never run jobs on the head node (see the interactive-job sketch after this list). Jobs found running on the head node will be killed immediately upon report, and repeated violations may lead to loss of access to BGSC and BOSE.
- No PII (Personally Identifiable Information) is allowed on either cluster: https://www.dhs.gov/privacy-training/what-personally-identifiable-information
- No HIPAA (Health Insurance Portability and Accountability Act) data is allowed on either cluster: https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html
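The snippet below sketches the first two responsibilities: copying data off the cluster for backup, and using an interactive SLURM session instead of computing on the head node. The hostname bose.hpc.uwec.edu is a hypothetical placeholder; use the login address provided by BGSC.ADMINS.

```bash
# back up a project directory to a local machine (run from the local machine;
# the hostname is a hypothetical placeholder)
rsync -avz username@bose.hpc.uwec.edu:~/project ./project-backup/

# instead of running work on the head node, request an interactive
# session on a compute node through SLURM
srun --partition=week --ntasks=1 --time=01:00:00 --pty bash
```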
2.7 Data Groups
UWEC faculty and staff members can request additional storage space for their research group on BOSE through BGSC.ADMINS. Based on the availability of resources, BGSC.ADMINS can approve requests for up to 2 TB of storage, which is shared by the whole group. Requests for storage beyond 2 TB shall be reviewed by BGSC.ADMINS and granted only if deemed necessary. Data stored in the group space will not be backed up, and the faculty or staff member who requested the space shall clean up unused files on a regular basis.
3. Software Requests
The BGSC and BOSE clusters are resources for research and teaching. As such, only software packages that serve research or teaching will be installed. To request the installation of a new software package, the user should:
- Email BGSC.ADMINS with the software name, version, and download link.
- Provide a brief description regarding the nature of the software and the project.
- If the software is requested by a student, a UWEC faculty/staff member should be identified as the point of contact and copied on any email exchanges.
- For licensed software, a proper license should be provided, and authorized users should be identified.
BGSC.ADMINS will review the request and, if deemed appropriate, request test data, install the software package, and set up access permissions accordingly.
4. Partitioning
The clusters will be partitioned to accommodate different types of jobs, including:
- GPU: partition that includes GPU nodes to allow for GPU calculations
- High memory: partition that includes high-memory nodes for jobs requiring large amount of memory
- CPU: partition including all the computational nodes
- Owner-specific: partition that includes nodes purchased by faculty/staff
- Software-specific: partition created to run a specific software package due to license restrictions. Only users with a proper license have access to such partitions
- Pre: preemptable partition designed to maximize cluster usage
- Reserved: partition created at the request of UWEC faculty/staff for classroom projects.
- Training: partition created for the purpose of new user training. Before successful completion of the mandatory training, new users only have access to this partition.
Note that partitions on BGSC and BOSE are different due to differences in hardware.
In general, nodes with similar hardware are grouped together to form a partition, for example, the GPU, CPU, reserved, and training partitions.
Faculty and staff who purchased computational nodes through their grant will have priority access to those nodes for a duration of three years or the lifetime of the grant, whichever is longer. The partition name and job duration will be chosen by the faculty/staff. After expiration, the nodes will be repartitioned into one of the other partitions based on their hardware.
A preemptable partition that includes all the nodes on the cluster is designed to maximize the use of the cluster. However, jobs running on the preemptable partition have a lower priority and will be suspended and requeued if the hardware occupied by the job is requested through an owner-specific partition.
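Since preemptable jobs can be suspended and requeued at any time, they should be written to tolerate interruption. A hedged sketch follows; the checkpoint file and resume flag are hypothetical and depend on the application:

```bash
#!/bin/bash
#SBATCH --partition=pre        # named "scavenge" on BGSC
#SBATCH --requeue              # allow SLURM to requeue this job when preempted
#SBATCH --ntasks=8
#SBATCH --time=24:00:00

# resume from the most recent checkpoint if one exists
# (hypothetical logic; the flag depends on the application)
if [ -f checkpoint.dat ]; then
    srun ./my_program --resume checkpoint.dat
else
    srun ./my_program
fi
```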
Nodes can be reserved at the request of UWEC faculty or staff members for classroom usage. The partition name and job duration will be determined on a case-by-case basis (see section 6 for policies regarding node reservation).
A list of partitions and the default settings are given below for BGSC cluster and BOSE cluster.
BGSC Cluster Partition Table
There is no maximal number of concurrent jobs on BGSC.

| Partition Type | Partition Name | # Nodes | Time Limit | Purpose |
|---|---|---|---|---|
| GPU | GPU | 3 | 7 days | Partition that uses exclusively nodes that contain GPUs; only to be used when GPUs are required for your job |
| CPU | week | 8 | 7 days | General use partition: should be used when jobs are expected to take less than a week, or the job can be restarted from a checkpoint |
| CPU | batch | 8 | 30 days | General use partition: should be used when jobs are expected to take up to a month to run |
| CPU | extended | 4 | 104 days | Special partition for jobs that will take up to 104 days and cannot be restarted |
| Owner-specific | (owner-specific) | 3 | unique | Nodes from individual grants, currently compute 71-73 |
| Reserved | Reserved | varies | varies | Varies according to the users’ needs |
| Training | training | varies | 1 day | Created as needed |
| Pre | scavenge | All | 24 hours | Low-priority partition only to be used for testing and debugging your software. Jobs on this partition will run on any available node with space but will be requeued if a job comes in through another partition |
Based on cluster usage, maximum job durations may change. Such changes will be communicated to all users in advance.
BOSE Cluster Partition Table
| Partition Type | Partition Name | # Nodes | Time Limit | Purpose |
|---|---|---|---|---|
| GPU | GPU | 4 | 7 days | Partition that uses exclusively nodes that contain GPUs; only to be used when GPUs are required for your job. Note that you must specify the number of GPU cards by setting #SBATCH --gpus=# in your .sh file (see the example after this table) |
| CPU | week | 51 | 7 days | General use partition: should be used when jobs are expected to take less than a week, or the job can be restarted from a checkpoint. This partition has the most nodes and is highly recommended for most jobs |
| CPU | month | 9 | 30 days | Special partition for longer jobs |
| CPU | highmemory | 1 | 7 days | Partition for jobs that require larger amounts of memory. The single lm01 node has 1.95 TB of memory |
| Software | magma | 10 | 7 days | Special partition for jobs that use the Magma software |
| Software | medea | 55 | 7 days | Special partition for jobs that use the MedeA software |
| Owner-specific | (owner-specific) | 0 | unique | Nodes purchased by users through external grants |
| Pre | pre | All | 24 hours | Low-priority partition only to be used for testing and debugging your software. Jobs on this partition will run on any available node with space but will be requeued if a job comes in through another partition |
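As noted in the GPU row above, BOSE GPU jobs must request GPU cards explicitly. A minimal sketch (the executable is a hypothetical placeholder):

```bash
#!/bin/bash
#SBATCH --partition=GPU
#SBATCH --gpus=1               # number of GPU cards, required on the GPU partition
#SBATCH --time=2-00:00:00      # within the 7-day limit
#SBATCH --output=gpu_job_%j.out

srun ./my_gpu_program          # hypothetical GPU-enabled executable
```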
5. Fair use policy
Both BGSC and BOSE are available, free of charge, to all faculty, staff, and students from all schools of the UW System. All users’ jobs are processed on a first-come, first-served basis. In addition, a fair use policy, described below, applies to all jobs submitted on BOSE. No such policy exists on BGSC at present, but one may be implemented if usage warrants.
Each user can use a maximum of 50% of the available resources of the cluster (30 CPU nodes and 6 GPU cards) at any given time. Additionally, to maximize efficiency, each user is limited to 10 concurrent jobs and 15 submitted jobs. The snippet below shows one way to check these counts.
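A user can check their own job counts against these limits with standard SLURM commands:

```bash
# number of currently running jobs
squeue -u $USER -t RUNNING -h | wc -l

# total number of submitted jobs (running + pending)
squeue -u $USER -h | wc -l
```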
To accommodate heavy usage due to ongoing internally or externally funded research projects, as well as to encourage faculty involvement in HPC-related outreach activities, principal investigators (PIs) and their research students can obtain an increased limit of 30 concurrent jobs and 45 submitted jobs if:
- The UW PI contributed at least $500 in supply budget through an internal or external grant in the past year.
- The UW PI made significant contributions to the growth of HPC resources in the past year. Such contributions include, but are not limited to, securing funding for HPC hardware, participating in HPC management, and organizing and/or serving in HPC outreach activities.
- The non-UW user pays a service fee (to be determined).
Job priority will be determined based on considerations including the user’s job history and resource requirements. In general, job priority:
- Decreases with the total number of jobs the user has run in the past 14 days.
- Increases with job wait time.
- Increases with job size, in cores, because it is more difficult for larger jobs to find the needed resources.
The cluster administrators reserve the right to change the above-described limits and priority settings based on the usage of the cluster. Such changes are discussed in a BGSC.ADVISORY meeting prior to implementation and will be communicated to users in advance.
Users who purchase their own nodes (owner-specific partitions) will have priority access to their own nodes for a duration of 3 years or the length of their grant, whichever is longer. The 50% resource limit discussed above (30 CPU nodes and 6 GPU cards) does not include nodes in the owner-specific partition. Jobs running on and submitted to the owner-specific partition do not count toward the running job and submitted job limits described above. The priority settings on these owner-specific partitions are as follows:
- All users can access owner-specific nodes by submitting to the “pre” partition assuming the nodes are idle.
- If a faculty owner (or their students from the same group) submits a job through the owner-specific partition to a node where other users’ jobs are running, those jobs will be suspended and requeued.
6. Reservation
UWEC faculty and staff members can request nodes for classroom usage. The request shall be made to BGSC.ADMINS at least two weeks in advance, with a detailed description including the nature of the classroom usage, the number of students in the class, the number of nodes needed, and the time frame of the reservation. In normal circumstances, BGSC will be used for this purpose; if BOSE is desired, justification should be provided to BGSC.ADMINS.

Upon receiving a request, BGSC.ADMINS will determine the approach to the reservation that minimizes the impact on existing jobs. Requests on short notice are possible, although a successful reservation cannot be guaranteed. In general, BGSC.ADMINS can approve requests for a maximum of ten nodes on BGSC or five nodes on BOSE, and for a maximum duration of 4 hours per reservation. Reservations of more nodes or of longer duration need approval at the weekly BGSC.ADMINS meeting.

If approved, the computational resources on the reserved nodes will only be available to the group of users identified by the requestor, and the same fair use policy described in section 5 applies to these users. Announcements regarding reservations shall be made to all users as soon as possible. In general, BGSC and BOSE shall not be reserved for research projects.
Last Updated: 9/15/2021