MIG
Like MPS, NVIDIA Multi-Instance GPU (MIG) lets multiple workloads share a single GPU, but with strict partitioning of resources. It is ideal for sharing a GPU among multiple users, as each MIG instance has a guaranteed set of resources and is fully isolated from the others.
MIG is available on selected NVIDIA Ampere Architecture GPUs, including A100, which supports a maximum of seven MIG instances per GPU. MIG can be combined with MPS, where multiple MPS clients can run simultaneously on each MIG instance, up to a maximum of 48 total MPS clients per physical GPU.
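To make these limits concrete, here is a minimal Python sketch of the arithmetic. It uses only the two figures quoted above (seven MIG instances and 48 total MPS clients per physical GPU) and assumes the clients are spread evenly across the MIG instances. The function names are hypothetical, and real deployments may be subject to further per-instance limits that this sketch does not model.

```python
# Limits taken from the text above (A100).
MAX_MIG_INSTANCES_PER_GPU = 7
MAX_MPS_CLIENTS_PER_GPU = 48


def max_clients_per_instance(mig_instances: int) -> int:
    """Largest even per-instance MPS client count that respects the total cap.

    Assumption: every MIG instance hosts the same number of MPS clients.
    """
    if not 1 <= mig_instances <= MAX_MIG_INSTANCES_PER_GPU:
        raise ValueError(
            f"mig_instances must be between 1 and {MAX_MIG_INSTANCES_PER_GPU}"
        )
    return MAX_MPS_CLIENTS_PER_GPU // mig_instances


def fits(mig_instances: int, clients_per_instance: int) -> bool:
    """Check whether a (MIG instances, MPS clients per instance) layout fits."""
    return 1 <= clients_per_instance <= max_clients_per_instance(mig_instances)


if __name__ == "__main__":
    for n in range(1, MAX_MIG_INSTANCES_PER_GPU + 1):
        print(f"{n} MIG instance(s): up to {max_clients_per_instance(n)} MPS clients each")
```

Under this even-split assumption, seven MIG instances leave room for at most six MPS clients each, because seven clients per instance would exceed the 48-client total.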
While it is possible to run a single application instance across multiple MIG instances, such as with MPI, MIG does not aim to offer any performance improvements for this use case. The primary aim of MIG is to facilitate multiple distinct application instances on each GPU.
Figure 1 shows how the number of simulations per GPU affects the total combined throughput (in ns/day, higher is better) of all simulations running simultaneously across the 8-GPU DGX A100 server, for RNAse (left) and ADH (right). The leftmost result for each case (one simulation per GPU) is only a few percent lower than the corresponding isolated single-simulation result multiplied by 8 (the number of GPUs in the DGX A100 server). This shows that simulations running concurrently on different GPUs of the server do not interfere significantly with one another.
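The comparison against "single-simulation result multiplied by 8" can be expressed as a simple scaling-efficiency ratio. The Python sketch below is illustrative only: the function names are hypothetical, and the numbers in the usage example are placeholders, not measurements from Figure 1.

```python
NUM_GPUS = 8  # GPUs in the DGX A100 server


def ideal_throughput(single_sim_ns_per_day: float, num_gpus: int = NUM_GPUS) -> float:
    """Combined throughput if every GPU matched the isolated single-simulation run."""
    return single_sim_ns_per_day * num_gpus


def scaling_efficiency(
    measured_total_ns_per_day: float,
    single_sim_ns_per_day: float,
    num_gpus: int = NUM_GPUS,
) -> float:
    """Measured combined throughput as a fraction of the ideal value."""
    return measured_total_ns_per_day / ideal_throughput(single_sim_ns_per_day, num_gpus)


if __name__ == "__main__":
    # Placeholder values chosen for illustration; substitute your own measurements.
    single = 100.0          # ns/day, one simulation alone on one GPU
    measured_total = 776.0  # ns/day, eight simulations, one per GPU
    eff = scaling_efficiency(measured_total, single)
    print(f"Scaling efficiency: {eff:.1%}")
```

An efficiency a few percent below 1.0 corresponds to the situation described above, where running one simulation on each GPU costs little compared with eight isolated runs.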
