Get started with P5 instances - Amazon Elastic Compute Cloud

Get started with P5 instances

P5 instances provide 8 NVIDIA H100 GPUs with 640 GB of high-bandwidth GPU memory. They feature 3rd generation AMD EPYC processors and provide 2 TiB of system memory, 30 TB of local NVMe instance storage, 3,200 Gbps of aggregate network bandwidth, and GPUDirect RDMA support. P5 instances also support Amazon EC2 UltraCluster technology, which provides lower latency and improved network performance using EFA.

The following table provides a summary of the p5.48xlarge specifications.

Specification       p5.48xlarge
vCPUs               192
System memory       2 TiB
GPUs                8 NVIDIA H100 GPUs
GPU memory          640 GB HBM3
Network bandwidth   3,200 Gbps with EFAv2
GPUDirect RDMA      Supported
GPU peer to peer    900 GB/s NVSwitch
Instance storage    8 x 3,800 GB NVMe SSD volumes

Software configuration

The easiest way to get started with P5 instances is to launch an instance using an AWS Deep Learning AMI that is preconfigured with all of the required software. For the latest AWS Deep Learning AMI for use with P5 instances, see the AWS Deep Learning Base GPU AMI (Ubuntu 20.04).

If you need to build a custom AMI for use with P5 instances, we recommend installing the following minimum software versions:

  • NVIDIA driver 535.54.03 or later

  • CUDA 12.1 or later

  • NVIDIA GDRCopy 2.3 or later

  • EFA installer 1.24.1 or later

  • NCCL 2.18.3 or later

  • aws-ofi-nccl plugin 1.7.2-aws or later
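
When validating a custom AMI, it can help to compare installed versions against the minimums listed above. The following is a minimal sketch, not an official AWS tool: version_ge and check_min are hypothetical helper names, and the comparison assumes GNU sort with the -V (version sort) option is available. Only the NVIDIA driver check is shown; the other components would need their own way of reporting a version.

```shell
#!/usr/bin/env bash
# Sketch: compare an installed version string against a required minimum.
# version_ge and check_min are hypothetical helpers, not part of any AWS or NVIDIA tool.

# Succeeds (exit status 0) when version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME INSTALLED REQUIRED: print a one-line verdict.
check_min() {
  if version_ge "$2" "$3"; then
    echo "OK   $1 $2 (minimum $3)"
  else
    echo "OLD  $1 $2 (minimum $3)"
  fi
}

# Example: check the NVIDIA driver, but only when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  check_min "NVIDIA driver" "$driver" "535.54.03"
fi
```

The same check_min call can be reused for the other components once you have their installed version strings, for example check_min "NCCL" "2.18.1" "2.18.3".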

We also recommend that you configure the instance to not use deeper C-states. For more information, see High performance and low latency by limiting deeper C-states. The latest AWS Deep Learning Base GPU AMI is preconfigured to not use deeper C-states.

Ubuntu 20.04 specific recommendations

The following recommendations for Ubuntu 20.04 help prevent unpredictable interface naming on boot:

  • Verify that you are running systemd 245.4-4ubuntu3.19 or later by running the following command:

    systemd --version

  • Configure GRUB to use a fixed interface naming scheme:

    • Open the /etc/default/grub configuration file in a text editor.

    • Edit the GRUB_CMDLINE_LINUX_DEFAULT entry to include net.naming-scheme=v247.

    • Regenerate the GRUB configuration by running sudo update-grub, and then reboot your instance.
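
The GRUB edit described above can also be scripted. The following is a minimal sketch under stated assumptions: add_naming_scheme is a hypothetical helper name, and it assumes that GRUB_CMDLINE_LINUX_DEFAULT sits on a single line with a double-quoted value. The demonstration runs against a scratch file so that nothing under /etc is modified.

```shell
#!/usr/bin/env bash
# Sketch: append net.naming-scheme=v247 to GRUB_CMDLINE_LINUX_DEFAULT.
# add_naming_scheme is a hypothetical helper. It edits the file passed as $1
# and assumes the entry is on one line with a double-quoted value.
add_naming_scheme() {
  local file="$1" param="net.naming-scheme=v247"
  # Do nothing if the parameter is already present on the entry.
  if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
    return 0
  fi
  # Insert the parameter just before the closing quote, keeping a .bak backup.
  sed -i.bak -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"/\1 ${param}\"/" "$file"
}

# Demonstration on a scratch copy, so the real configuration is not touched.
demo="$(mktemp)"
printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$demo"
add_naming_scheme "$demo"
grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$demo"
```

To apply the change for real, you would run the helper as root against /etc/default/grub, review the result, run sudo update-grub, and then reboot the instance.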

Networking and EFA configuration

P5 instances deliver 3,200 Gbps of networking bandwidth by using multiple EFA interfaces. P5 instances support 32 network cards. We recommend that you define a single EFA network interface per network card. To configure these interfaces at launch, we recommend the following settings:

  • For network interface 0, specify device index 0

  • For network interfaces 1 through 31, specify device index 1
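
The settings above can be generated rather than typed by hand. The following is a minimal sketch that prints one interface specification per network card. It assumes that "network interface N" in the list above corresponds to network card index N, and that the output is intended for the shorthand form of the --network-interfaces option of aws ec2 run-instances; verify both against the AWS CLI reference before use. SUBNET_ID and SG_ID are placeholders, and efa_interface_specs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: build one EFA interface specification per network card (0 through 31).
# SUBNET_ID and SG_ID are placeholders, not real resource IDs.
SUBNET_ID="subnet-EXAMPLE"
SG_ID="sg-EXAMPLE"

# efa_interface_specs is a hypothetical helper that prints 32 lines, one per card.
efa_interface_specs() {
  local card device
  for card in $(seq 0 31); do
    # Network card 0 uses device index 0; every other card uses device index 1.
    if [ "$card" -eq 0 ]; then device=0; else device=1; fi
    echo "NetworkCardIndex=${card},DeviceIndex=${device},Groups=${SG_ID},SubnetId=${SUBNET_ID},InterfaceType=efa"
  done
}

efa_interface_specs
```

Each printed line describes one EFA interface. Replace the placeholder subnet and security group IDs with your own values before passing the lines to a launch command.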

For more information about how to configure your P5 instances for EFA, see Get started with P5 instances and EFA.