# Private Partitions

Private partitions are lab-specific Slurm partitions associated with dedicated MSI compute hardware. They are useful when a group has purchased dedicated compute and needs a queue that is only available to an approved set of users.

If your lab is considering dedicated hardware but does not have it yet, see MSI's [Dedicated Computing](https://msi.umn.edu/computing/dedicated-computing) page.

## Access List

Most MSI users do not have a private partition by default. If your lab purchased dedicated compute, your PI or group administrator should be able to tell you:

- Partition name
- Which groups of users have access (by default, the whole project has access)

If a partition has been configured for group access, it will also appear to eligible users on the Open OnDemand [dedicated compute](https://ondemand.msi.umn.edu/pun/sys/dashboard/custom/dedicated_compute) page. This is often the easiest way to confirm that a private partition exists and to identify its name.

If you are not sure, start by asking your PI, group administrator, or the person in your lab who manages MSI access.

Another method is to inspect the UNIX group memberships for users in your lab or project group. This can help identify dedicated-access groups even when no recent jobs have run on the partition.

`getent group <group name> | cut -d ':' -f 4 | while read -r member; do groups ${member//,/ }; done`

For example:

```bash
08:52:54 [vega0051@vortex ~ ]$ getent group wma | cut -d ':' -f 4 | while read -r member; do groups ${member//,/ }; done
wma : wma wma-dedicated
ekl : ekl wma agate_a100_long
chen9462 : wma wma-dedicated
escud012 : wma wma-dedicated
wan01975 : wma
```
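
The one-liner above can also be unpacked into small functions that are easier to read and reuse. This is a minimal sketch, not an MSI-provided tool: the function names `group_members` and `show_member_groups` are hypothetical, and it assumes `getent` and `id` behave as they do on a typical Linux login node. Note that `getent group` usually lists only users who have the group as a supplementary group.

```bash
# Print the members of a UNIX group, one per line.
# Reads a getent-style line (name:passwd:gid:member1,member2) on stdin.
group_members() {
  cut -d ':' -f 4 | tr ',' '\n' | sed '/^$/d'
}

# For each member of the given group, print "member : group1 group2 ...".
show_member_groups() {
  local group=$1 member
  getent group "$group" | group_members | while read -r member; do
    printf '%s : %s\n' "$member" "$(id -nG "$member" 2>/dev/null)"
  done
}

# Usage (wma is the example group shown above):
#   show_member_groups wma
```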

This approach is useful because it does not depend on whether anyone in the group has used the partition within the last day.

Once you have a partition name, inspect the partition definition to see its nodes and allowed groups:

```bash
scontrol -a show partition <partition name>
[vega0051@ahl04 ~ ]$ scontrol show partition <partition name> --json | jq '.partitions[0] | { name,nodes,groups }'
{
  "name": "<partition name>",
  "nodes": {
    "allowed_allocation": "",
    "configured": "<node list>",
    "total": <number of nodes>
  },
  "groups": {
    "allowed": "<group(s) that have access>"
  }
}
```
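
To compare the allowed groups with your own group memberships, something like the following sketch may help. The function name `has_partition_access` is hypothetical, and the sketch assumes the allowed groups are given as a comma-separated list, as in the `AllowGroups=` field printed by `scontrol show partition`.

```bash
# Succeed if any group in the space-separated list $2 appears in the
# comma-separated allowed-groups value $1, or if the partition allows ALL groups.
has_partition_access() {
  local allowed=$1 mine=$2 g
  [[ $allowed == ALL ]] && return 0
  for g in $mine; do
    # Wrap both sides in commas so "wma" does not match "wma-dedicated"
    [[ ",$allowed," == *",$g,"* ]] && return 0
  done
  return 1
}

# Usage on a login node (mypartition is a placeholder):
#   allowed=$(scontrol -a show partition mypartition | tr ' ' '\n' | sed -n 's/^AllowGroups=//p')
#   has_partition_access "$allowed" "$(id -nG)" && echo "You are in an allowed group"
```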

### How to Request Access

Some private partitions are configured so that anyone in the PI group can use them. Others use a separate allow list for individual users.

If your lab uses a partition user list:

1. Contact your PI or lab administrator and confirm that you should be added.
2. Ask them to email `help@msi.umn.edu` if the partition access list needs to be updated.
3. Be ready to provide your Internet ID and the partition name.

If you are a new MSI user, make sure you have already accepted the [User Agreement](https://mymsi.msi.umn.edu/user/user-agreement/) and completed the normal onboarding steps described in {doc}`../new_users/mymsi`.

## Scheduling Jobs

### Finding Partition Limits and Features

Private partitions have their own scheduling limits, just like other Slurm partitions. Check the partition configuration directly when you need to confirm values such as maximum walltime, node count, or default settings.

```bash
# Show the partition configuration
scontrol -a show partition <partition name>

# Show a compact summary for the partition
sinfo -p <partition name>
```

These commands are useful for checking:

- Node names in the partition
- The walltime limit shown by Slurm
- The current node state such as `idle`, `mix`, `alloc`, or `down`

For example:

```bash
sinfo -p <partition name>
```

```text
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
sioux        up 37-00:00:0      1   comp acn166
sioux        up 37-00:00:0      1    mix acn116
```

Dedicated compute service is made available through the same MSI scheduler used by other compute resources. Depending on the configuration, private partitions may allow much longer runtimes than standard shared partitions, such as a maximum walltime of 37 days.

Private partitions are also subject to MSI maintenance reservations. A job will only start if its requested walltime allows it to finish before the next reservation begins. You can view upcoming reservations with:

```bash
scontrol show reservation
```
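
The rule above is simple arithmetic: the current time plus the requested walltime must not pass the reservation start. The sketch below illustrates that check. The function names are hypothetical, only the `HH:MM:SS` and `D-HH:MM:SS` walltime forms are handled, and the reservation start is assumed to be supplied as epoch seconds.

```bash
# Convert a walltime written as HH:MM:SS or D-HH:MM:SS to seconds.
walltime_to_seconds() {
  local spec=$1 days=0 hh mm ss
  if [[ $spec == *-* ]]; then
    days=${spec%%-*}
    spec=${spec#*-}
  fi
  IFS=: read -r hh mm ss <<< "$spec"
  # 10# forces base 10 so values such as "08" are not parsed as octal
  echo $(( 10#$days * 86400 + 10#$hh * 3600 + 10#$mm * 60 + 10#$ss ))
}

# Succeed if a job started now would end before the reservation begins.
# $1 = walltime, $2 = reservation start (epoch seconds), $3 = "now" (optional)
fits_before_reservation() {
  local walltime=$1 start=$2 now=${3:-$(date +%s)}
  (( now + $(walltime_to_seconds "$walltime") <= start ))
}

# Usage, with a StartTime value copied from `scontrol show reservation`
# (the -d option assumes GNU date):
#   start=$(date -d "<StartTime value>" +%s)
#   fits_before_reservation 08:00:00 "$start" && echo "An 8-hour job can start now"
```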

### Specifying the Partition in a Job Script or Request

Include the partition name in your Slurm script:

```bash
#!/bin/bash -l
#SBATCH --job-name=my-private-job
#SBATCH --partition=mypartition
#SBATCH --time=08:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1

./my_program
```

You can also provide the partition on the command line:

```bash
sbatch -p mypartition job.sh
```

If the partition includes GPUs or other specialized hardware, request those resources explicitly in the same way you would on shared partitions. For example:

```bash
#SBATCH --partition=mygpu
#SBATCH --gres=gpu:1
```

### Shared Use Considerations

Even though the hardware is dedicated to your lab, the resources are still finite.

- Coordinate with lab members when multiple people share the same node or partition.
- Request only the CPU, memory, time, and GPU count you actually need.
- Long jobs, large jobs, or jobs that request all GPUs on a node can block other lab members from starting work.
- If the partition is small, a single fully allocated node may mean no one else in the lab can run until that job finishes.
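
As an illustration of a right-sized request, the sketch below asks for part of a node rather than all of it. The partition name, CPU count, memory, and walltime are placeholder values rather than recommendations, and the `OMP_NUM_THREADS` line only matters if your program uses OpenMP, which is an assumption here.

```bash
#!/bin/bash -l
#SBATCH --job-name=right-sized-job
#SBATCH --partition=mypartition
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 when the script runs outside a job.
threads=${SLURM_CPUS_PER_TASK:-1}
export OMP_NUM_THREADS=$threads
echo "Using $threads thread(s)"

# Replace this comment with your own command, for example: ./my_program
```

Matching the thread count to the CPUs you requested leaves the remaining cores and memory on the node free for other lab members.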

The same right-sizing advice in {doc}`Job Sizing and Efficiency <job_sizing>` applies to private partitions.

## Additional Features

These are the **optional** features that can be enabled on dedicated compute partitions:

- **Extended walltime**: private partitions can be configured with much longer maximum runtimes than standard shared partitions, for example up to 37 days.
- **Slurm bypass**: some configurations may allow direct access outside normal job submission. If enabled, labs should use this carefully because work started outside Slurm is not coordinated by the scheduler and can interfere with other users on the same dedicated resource.
- **Shared scratch**: a partition can be configured with shared scratch behavior for lab workflows that need temporary working space across users on the dedicated resource. Treat scratch as temporary space rather than long-term storage.
- **`nohidepid`**: process visibility can be configured so users on the partition can inspect other users' processes with tools such as `top` and `htop`.

For service details and purchasing information, see the MSI [Dedicated Computing](https://msi.umn.edu/computing/dedicated-computing) page.

## Troubleshooting

### Check How Busy the Partition Is

From Open OnDemand, use the [dedicated compute page](https://ondemand.msi.umn.edu/pun/sys/dashboard/custom/dedicated_compute) to review your jobs, assigned nodes, and node states. This is often the quickest way to confirm whether your private partition is already in use. It covers much of the same information that users commonly check with `squeue`, `sinfo`, and sometimes `scontrol`.

From the command line, start with:

```bash
# Show jobs in the partition
squeue -p mypartition

# Show node state for the partition
sinfo -N -p mypartition
```

If you want more detail for a specific node:

```bash
scontrol -a show node <node name>
```
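
The full `scontrol show node` output is long. If you only want a few fields, a small filter such as the sketch below can pull them out. The function name `scontrol_field` is hypothetical, and the approach assumes the value you want contains no spaces, which holds for fields such as `State`, `CfgTRES`, and `AllocTRES` but not for free-text fields such as `Reason`.

```bash
# Print the value of one KEY=value field from scontrol output read on stdin.
scontrol_field() {
  local key=$1
  # Put each KEY=value pair on its own line, then keep everything after "KEY="
  tr ' ' '\n' | awk -F= -v key="$key" '$1 == key { print substr($0, length(key) + 2) }'
}

# Usage on a login node (acn116 is the example node shown earlier):
#   scontrol -a show node acn116 | scontrol_field State
#   scontrol -a show node acn116 | scontrol_field AllocTRES
```

Comparing `AllocTRES` with `CfgTRES` shows how much of the node is already allocated.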

Useful node states include:

- `idle`: no jobs are using the node
- `mix`: the node is in use, but some resources are still free
- `alloc`: the node is fully allocated
- `comp`: the node is completing one or more jobs
- `drain`: the node is not accepting new work
- `down`: the node is unavailable
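
Using the states above, you can list only the nodes that may still accept work. This is a rough sketch: the function name `nodes_with_free_resources` is hypothetical, and it expects one `<node> <state>` pair per line, such as the output of `sinfo -h -N -p <partition name> -o "%N %t"`.

```bash
# Print the names of nodes whose state starts with "idle" or "mix".
# Reads "<node> <state>" lines on stdin; state suffixes such as "*" are tolerated.
nodes_with_free_resources() {
  awk '$2 ~ /^(idle|mix)/ { print $1 }'
}

# Usage on a login node (mypartition is a placeholder):
#   sinfo -h -N -p mypartition -o "%N %t" | nodes_with_free_resources
```

A node in the `mix` state may still be too full for a large request, so treat the result as a starting point rather than a guarantee.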

### Understanding Job States

Common Slurm job states such as `PD` (pending), `R` (running), and `CG` (completing) are described on the {doc}`Slurm Job Submission and Scheduling <slurm_job_submission>` page.

If a job is pending, check the `NODELIST(REASON)` column in `squeue` to see whether it is waiting on resources, limits, or reservations.
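
As a rough guide to reading that column, the sketch below maps a few common Slurm reason codes to plain-language hints. The function name `explain_reason` is hypothetical and the list is far from complete; see the `squeue` man page for the full set of reason codes.

```bash
# Print a short hint for a pending-job reason code reported by squeue.
explain_reason() {
  case $1 in
    Resources*)          echo "waiting for resources to become available" ;;
    Priority*)           echo "higher-priority jobs are queued ahead of this one" ;;
    ReqNodeNotAvail*)    echo "a required node is down, drained, or reserved" ;;
    PartitionTimeLimit*) echo "requested walltime exceeds the partition limit" ;;
    *)                   echo "no hint available for reason: $1" ;;
  esac
}

# Usage on a login node (mypartition is a placeholder):
#   squeue -h -p mypartition -t PD -o "%i %r" | while read -r jobid reason; do
#     echo "$jobid: $(explain_reason "$reason")"
#   done
```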

### What to Do When the Node Is Unavailable

If your private partition appears unavailable, the next step depends on the node state:

- `alloc`: the node is fully assigned to one or more jobs. Wait for resources to free up or coordinate with your lab members.
- `mix`: part of the node may still be available, but not enough for your request.
- `drain` or `down`: a node in either state is automatically flagged for reboot or servicing. Contact the MSI help desk at `help@msi.umn.edu` if you have further questions.
- If jobs are not starting because of an upcoming maintenance reservation, shorten the requested walltime or wait until the reservation passes.