Nvidia Multi-Instance GPU (MIG)

The new 'Multi-Instance GPU (MIG)' feature allows GPUs to be partitioned into up to seven separate GPU instances for CUDA applications.

Supported GPUs

  • H100-SXM5
  • H100-PCIE
  • A100-SXM4 (40 and 80 GB)
  • A100-PCIE (40 and 80 GB)
  • A30

Enable MIG Mode

By default, MIG mode is not enabled on the GPU, so you need to enable it:

$ nvidia-smi -i <GPU IDs> -mig 1

The GPUs can be selected using comma separated GPU indexes. If no GPU ID is specified, then MIG mode is applied to all the GPUs on the system.

List GPU Instance Profiles

Once the MIG mode is activated, the NVIDIA driver provides a number of profiles that users can opt-in for when configuring the MIG feature in A100.

$ sudo nvidia-smi mig -lgip

Creating GPU Instances

Before starting to use MIG, the user needs to create GPU instances using the -cgi option. Once the GPU instances are created, one needs to create the corresponding Compute Instance (CI). By using the -C option, nvidia-smi creates these instances.

$ sudo nvidia-smi mig -cgi <profile_ID> -C

Now list the available GPU instances:

$ sudo nvidia-smi mig -lgi

and verify that the GIs and corresponding CIs are created:

$ nvidia-smi

Destroying GPU Instances

Once the GPU is in MIG mode, GIs and CIs can be configured dynamically. If the intention is to destroy all the CIs and GIs (recommended), then this can be accomplished with the following commands:

$ sudo nvidia-smi mig -dci && sudo nvidia-smi mig -dgi

If the output of this command says that it is impossible to destroy the instance because the GPU is occupied by a process, delete the process or restart the system with:

$ sudo reboot

Further reading

  1. Nvidia Data Center Documentation: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/#lgi