vGPU Capabilities

Topic: Arcus

Overview

A GPU is a Graphics Processing Unit. It allows computers to better handle graphics-intensive applications, such as web design, video editing, and 3D rendering. A virtual GPU (vGPU) brings the same functionality to a virtualized system, with much greater flexibility in overall capacity and resource availability.

Arcus has implemented vGPU capabilities to support users who require additional resources for graphics- or compute-intensive applications.

Capacity & Availability

If your organization has purchased vGPU capabilities via Arcus, attaching a vGPU to a specific Arcus system is self-service. Please keep in mind the number of vGPU instances supported at a given time, and ensure that any users deploying vGPU-enabled systems understand which GPU types and drivers are available in their selected cloud.

Managing vGPU Resources

Arcus Team Managers can view the total number of vGPUs available to their team. If additional resources are required beyond what is currently visible, please contact support.

Arcus Project Owners can view the total number of vGPUs available to their team, and can also set Project-specific limits under Manage, beneath the Project Limits header, as shown.

Project GPU Limit Setting

Like other Arcus Project-specific limits, these limits are most useful when there are multiple projects in a single team, and the Team Manager or Project Owner wants to prevent a single project from consuming the entire team’s resources.

Project Owners can also check the “unlimited” checkbox next to each GPU type to allow unrestricted use of team resources.

Setting up a vGPU-enabled System

Any Arcus user can build a vGPU-enabled system in Arcus by checking the Requires vGPU box under the Additional Capabilities header during initial system creation. Checking this box will present the user with a dropdown menu from which they can select the GPU type.

If no selection is made, Arcus will use any available GPU type to satisfy the request.

System GPU toggle

For more information on general Arcus system creation, see the system creation instructions.

Assets

You’ll need to install the NVIDIA GPU Drivers asset that matches your operating system: NVIDIA GPU Drivers for Linux or NVIDIA GPU Drivers for Windows. These assets should work for all cloud types (VCloud, AWS, Azure).

Linux: https://app.arcus-cloud.io/#/software/6399/overview
Windows: https://app.arcus-cloud.io/#/software/1710/overview

See “Asset help” for additional information.

Driver Re-Install Script

A copy of the installer script is placed in C:\cons3rt on Windows machines and in /opt/ on Linux machines. You can use this script to re-install the drivers, which may be necessary if the drivers require updates or if you are having licensing issues with your vGPU drivers. Run the script from an Administrator PowerShell session on Windows, or from a console as root or as a user with sudo privileges on Linux. The commands for each are as follows:

For Linux:

Make sure there is not already an NVIDIA-Linux-x86_64-latest-grid.run file in the /opt/ folder, and then run the install_drivers.sh script. Do this either as the root user or as a user who can run the script with sudo privileges.

root

user
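As a rough sketch of the Linux steps above, the following helper checks for a leftover installer file and then runs the script, using sudo when the current user is not root. It assumes the script sits at /opt/install_drivers.sh and is run from that folder with no arguments; the function names are hypothetical and are not part of the Arcus asset.

```shell
#!/usr/bin/env bash
# Sketch only: the folder and file names come from the text above.
# The helper names are hypothetical, and running the script without
# arguments from inside the folder is an assumption.

INSTALL_DIR="${INSTALL_DIR:-/opt}"
RUN_FILE="NVIDIA-Linux-x86_64-latest-grid.run"

# Succeeds (exit 0) when no leftover installer file exists in the folder.
no_stale_installer() {
    local dir="$1"
    [ ! -e "${dir}/${RUN_FILE}" ]
}

# Runs install_drivers.sh from the given folder, via sudo if not root.
# Refuses to run (exit 1) while a leftover installer file is present.
reinstall_drivers() {
    local dir="${1:-$INSTALL_DIR}"
    if ! no_stale_installer "$dir"; then
        echo "Remove ${dir}/${RUN_FILE} before re-running the installer." >&2
        return 1
    fi
    if [ "$(id -u)" -eq 0 ]; then
        ( cd "$dir" && ./install_drivers.sh )
    else
        ( cd "$dir" && sudo ./install_drivers.sh )
    fi
}

# Usage (as root or as a sudo-privileged user):
#   reinstall_drivers /opt
```

The helper deliberately stops rather than deleting the leftover file, so that removing it stays a manual decision.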

Note: When you run this script to re-install drivers, the installation may fail because the existing drivers are currently in use. The error may look something like this:

ERROR: An NVIDIA kernel module ’nvidia’ appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module’s usage count, for which the simplest remedy is to reboot your computer.

If you see this error, make sure the gdm service is stopped before attempting to re-run the script.
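One possible way to do this on a systemd-based Linux system is sketched below. It assumes the display manager unit is named gdm (some distributions name it differently, for example gdm3); the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a systemd host whose display manager unit is "gdm".

# Reads an lsmod-style listing on stdin and succeeds (exit 0) when a
# module named exactly "nvidia" appears in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Stops the gdm service only if it is currently active.
stop_gdm_if_active() {
    if systemctl is-active --quiet gdm; then
        sudo systemctl stop gdm
    fi
}

# Usage:
#   stop_gdm_if_active
#   if lsmod | nvidia_module_loaded; then
#       echo "The nvidia module is still loaded; a reboot may be needed." >&2
#   fi
```

If the module is still loaded after the display manager has stopped, the installer message above suggests a reboot as the simplest remedy.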

For Windows:

Navigate to the C:\cons3rt folder in an Administrator PowerShell session and run the script. Make sure to set the execution policy to Bypass in order to execute the .ps1 file.

powershell

Remember that a reboot is necessary after the script has run for the vGPU drivers to be fully installed and functional on the system.

Launching a vGPU-enabled Run

If a user enables vGPU at the system level as described above, the user will be prompted to complete the vGPU setup as part of the launch process. This applies both when launching the finished Arcus system directly (via quickbuild) and when launching a completed deployment based on that system design.

At the Configure Resources step in the Arcus launch process, the user will be presented with a dropdown menu to select the GPU profile available in the selected cloudspace.

GPU Type Dropdown

If you are unsure which GPU type to select, check the vGPU Types Available section of this KB. Alternatively, consult the official NVIDIA GRID documentation to confirm that the vGPU satisfies the requirements of the desired deployment or application.

GPU Type Dropdown

Once that selection is complete, click Next and complete the remainder of the Arcus launch process.

Please Note: The user must select a GPU profile if launching into an H2 cloudspace. There is no additional selection required when launching into an AWS or Azure cloudspace.

Exceeding Current vGPU Capacity

If a user tries to deploy an Arcus system with vGPU enabled and selects a GPU type whose request exceeds the currently assigned GPU capacity, the Deployment Run will remain in the Scheduled state until one of the following happens: it is manually cancelled, additional capacity is added, or an existing Deployment Run using the same vGPU type is released and frees up the requested capacity.

GPU Exceeded

Please Note: A scheduled Deployment Run does not consume resources in this state, but it is not advised to leave Deployment Runs in this state for extended periods of time.
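The capacity rule described in this section can be illustrated with a small sketch. This is not how Arcus implements scheduling; it only restates the comparison, assuming capacity is counted in whole vGPUs per GPU type. The function name and the "Ready" label are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: restates the capacity check described in the text.
# Arguments: requested, in_use, capacity (whole vGPUs of a single type).
# Prints "Scheduled" when the request does not fit, otherwise "Ready"
# ("Ready" is a placeholder label, not an Arcus state name).
vgpu_request_state() {
    local requested="$1" in_use="$2" capacity="$3"
    if [ $((in_use + requested)) -gt "$capacity" ]; then
        echo "Scheduled"
    else
        echo "Ready"
    fi
}

# Usage:
#   vgpu_request_state 2 3 4   # 3 in use + 2 requested exceeds 4
#   vgpu_request_state 1 3 4   # 3 in use + 1 requested fits within 4
```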

vGPU Types Available

On Premise Cloud

GPU Mfg	Model	Use Case
NVIDIA	M10	Graphic Focused
NVIDIA	V100	Compute Focused
NVIDIA	RTX8000	Compute Focused, Graphic Focused

AWS Clouds

For additional information on AWS Commercial and AWS GovCloud vGPU types and availability, please see the official AWS documentation.

Azure Clouds

For additional information on Azure Commercial and Azure Gov vGPU types and availability, please see the official Azure documentation.

Configuration

All hypervisors with vGPU capabilities are configured with the NVIDIA drivers. The operating system templates in the vGPU-enabled cloudspaces have had the appropriate video driver installed and tested for vSGA access (e.g., VMware SVGA 3D).

Please be aware that additional drivers may need to be installed or updated depending on your specific use case. Also keep driver version dependencies in mind when researching your intended application.

Basic Troubleshooting

Ensure the application supports the correct versions of DirectX and OpenGL.
Validate browser support for OpenGL: http://get.webgl.org
Validate performance of OpenGL and vGPU on a system: http://madebyevan.com/webgl-water/
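In addition to the checks above, it can help to confirm from inside the system that the NVIDIA driver is installed and can see the vGPU. The sketch below uses the nvidia-smi utility that ships with the NVIDIA driver; whether it is on the PATH in a given Arcus system is an assumption, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: reports the GPU name and driver version via nvidia-smi.
# Returns 1 with a message when nvidia-smi cannot be found on the PATH.
gpu_driver_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; the NVIDIA driver may not be installed." >&2
        return 1
    fi
    # One line per GPU in the form "name, driver_version".
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Usage:
#   gpu_driver_summary || echo "Consider re-running the driver install script."
```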

The Arcus admin team can validate deployment runs and video memory usage in support of troubleshooting. However, at this time, they must be on keyboard while the test is underway, so access to this information is reserved for coordinated troubleshooting events with the user.
