Orchestrating Alveo Compute Workloads with XRM
Overview
What if we could abstract the Alveo card resources, query the available compute resources, and launch work on them? This is exactly what the Xilinx Resource Management (XRM) service allows! It is a resource manager with a set of APIs built on top of the Xilinx runtime (XRT). With XRM, multiple applications can run together while sharing multiple Alveo cards in a flexible arrangement.
Provisioning Compute Units from a Pool of Alveo Cards
With XRM, several xclbin files are preloaded onto different cards; applications then retrieve the hardware compute units their algorithms require without worrying about which card will service them.
Advantages of using XRM:
- Abstracting the physical card from the application: the application only manages compute units, while the XRM daemon takes care of finding an available compute unit within any of the preloaded cards
- Allocating compute units with or without exclusive access to a card. This finer-grained access control enables higher utilization of the underlying cards
XRM assigns resources at the compute unit (CU) level. Because a given xclbin programs a whole card but can contain several CUs, managing at CU granularity rather than at the xclbin level lets this card management system make better use of the underlying resources.
Building and Installing XRM
XRM is available on the Xilinx GitHub (https://github.com/Xilinx/XRM) and is compatible with the v2020.1 release of Vitis.
The steps below describe how to build and install XRM on Linux (both CentOS and Ubuntu):
1: Download from the GitHub repo
git clone https://github.com/Xilinx/XRM.git
2: Set up the build environment:
The following command assumes XRT, the Xilinx runtime, is already installed on the server.
source /opt/xilinx/xrt/setup.sh
3: Run "make" under the base directory of XRM
For Centos, the install requires a recent version of GCC, users will need to upgrade from their defaults and might want to use software collections to switch to a newer version:
scl enable devtoolset-(6|7) bash
Build the Linux package for the target flavor of Linux (CentOS or Ubuntu)
source /opt/xilinx/xrt/setup.sh
./build.sh
cd ./Release
make package
4: Install the package created in the previous step
Ubuntu:
sudo apt install --reinstall ./Release/xrm_version.deb
CentOS:
sudo yum reinstall ./Release/xrm_version.rpm
Once done, the XRM files are installed in /opt/xilinx/xrm/
Launch the Daemon and Set Up the XRM Database
As discussed in the introduction, the XRM API commands allocate compute units (CUs), which are encapsulated in the xclbin binary files. To establish the correspondence between the CUs referred to in the API and the xclbin files, XRM needs to know which assets are available on the server; this is done through a local database managed by the xrmadm command:
1: Source the setup scripts:
source /opt/xilinx/xrt/setup.sh
source /opt/xilinx/xrm/setup.sh
2: Start the xrmd daemon and optionally clean up the database for a fresh start (root access required):
sudo /opt/xilinx/xrm/tools/start_xrmd.sh
To stop the daemon service:
sudo /opt/xilinx/xrm/tools/stop_xrmd.sh
To check the daemon status:
sudo systemctl status xrmd
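An application can also verify that the daemon is reachable programmatically with xrmIsDaemonRunning(). Here is a minimal sketch against the v2020.1 C API (check /opt/xilinx/xrm/include/xrm.h for the exact signatures):

#include <stdio.h>
#include <xrm.h>

int main(void)
{
    /* A context is required for any XRM call; it connects to xrmd. */
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx != NULL && xrmIsDaemonRunning(ctx))
        printf("xrmd is running\n");
    else
        printf("xrmd is NOT running\n");
    if (ctx != NULL)
        xrmDestroyContext(ctx);
    return 0;
}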
3: Load or unload xclbin binaries specified in a JSON file
The input is in JSON format; refer to the examples under /opt/xilinx/xrm/test/ for how to specify the xclbin file and device for load/unload operations.
cd /opt/xilinx/xrm/test/
xrmadm list_cmd.json (To check the current system state)
xrmadm load_devices_cmd.json (To load xclbin files to devices)
xrmadm list_cmd.json (To check the load result)
xrmadm unload_devices_cmd.json (To unload xclbin from devices)
xrmadm list_cmd.json (To check the unload result)
Here is an example of a JSON load input file for the xrmadm command:
{
  "request": {
    "name": "load",
    "request_id": 1,
    "parameters": [
      {
        "device": 0,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 1,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 2,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 3,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      }
    ]
  }
}
The list command then produces this style of output, also in JSON format (see below). Each card's information is extracted from the xclbin metadata, and the kernel and compute unit names are listed:
{
  "response": {
    "name": "list",
    "request_id": "1",
    "status": "ok",
    "data": {
      "device_number": "4",
      "device_0": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_1": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_2": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_3": {
        "cu_0": {
        },
        "cu_1": {
        }
      }
    }
  }
}
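Loading and unloading can also be done from application code, without going through xrmadm, using xrmLoadOneDevice() and xrmUnloadOneDevice(). Below is a minimal sketch reusing the device ID and xclbin path from the load file above (verify the exact signatures against xrm.h):

#include <stdio.h>
#include <xrm.h>

int main(void)
{
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL)
        return 1;

    /* Load the xclbin onto device 0, as the JSON example above does. */
    char xclbin[] = "/repo/xclbins/file.xclbin.xrm";
    if (xrmLoadOneDevice(ctx, 0, xclbin) < 0)
        fprintf(stderr, "load failed on device 0\n");

    /* ... allocate CUs and run work here ... */

    /* Unload device 0 once the xclbin is no longer needed. */
    xrmUnloadOneDevice(ctx, 0);
    xrmDestroyContext(ctx);
    return 0;
}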
Check the System
To check the system configuration, the XRM GitHub repository provides example test executables to verify the CUs.
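Availability can also be checked from application code; for example, xrmCheckCuAvailableNum() reports how many CUs matching a given kernel property could be allocated right now. A minimal sketch (the kernel name is illustrative, not taken from a real xclbin):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <xrm.h>

int main(void)
{
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL)
        return 1;

    /* Describe the CU we are interested in; "my_kernel" is illustrative. */
    xrmCuProperty cuProp;
    memset(&cuProp, 0, sizeof(cuProp));
    strcpy(cuProp.kernelName, "my_kernel");
    cuProp.devExcl = false;   /* no exclusive card access needed */
    cuProp.requestLoad = 100; /* ask for a fully free CU */

    printf("%d CU(s) available\n", xrmCheckCuAvailableNum(ctx, &cuProp));
    xrmDestroyContext(ctx);
    return 0;
}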
XRM API Commands for Application Developers
In this section we present the main APIs available to developers for creating XRM-enabled applications.
This list was composed from a version that might since have been superseded; check the repository for the current list of API functions.
| Command | Description |
| --- | --- |
| xrmCreateContext() | Establishes a connection with the XRM daemon |
| xrmDestroyContext() | Disconnects an existing connection with the XRM daemon |
| xrmIsDaemonRunning() | Checks whether the daemon is running |
| xrmLoadOneDevice() | Loads an xclbin to one device |
| xrmUnloadOneDevice() | Unloads the xclbin from one device |
| xrmCuAlloc() | Allocates a compute unit (device, CU, and channel) given a kernel name or alias (or both) and a requested load (1-100). This function also provides the xclbin and plugin to be loaded on the device. |
| xrmCuListAlloc() | Allocates a list of compute unit resources given a list of kernel properties, each with a kernel name or alias (or both) and a requested load (1-100) |
| xrmCuRelease() | Releases a previously allocated resource |
| xrmCuListRelease() | Releases a previously allocated list of resources |
| xrmCuGetMaxCapacity() | Retrieves the maximum capacity associated with a resource |
| xrmCuCheckStatus() | Returns whether or not a specified CU resource is busy |
| xrmAllocationQuery() | Queries the compute unit resources given the allocation service ID |
| xrmCheckCuAvailableNum() | Checks the number of CUs available on the system given a kernel property with name or alias (or both) and a requested load (1-100) |
| xrmCheckCuListAvailableNum() | Checks the number of CU lists available on the system given a list of kernel properties with kernel name or alias (or both) and a requested load |
| xrmCheckCuPoolAvailableNum() | Checks the number of CU pools available on the system given a pool of kernel properties with kernel name or alias (or both) and a requested load |
| xrmCuPoolReserve() | Reserves a pool of compute unit resources given a pool of kernel properties with kernel name or alias (or both) and a requested load (1-100) |
| xrmCuPoolRelinquish() | Relinquishes a previously reserved pool of resources |
| xrmReservationQuery() | Queries the compute unit resources given the reservation ID |
| xrmExecPluginFunc() | Executes the function of a specified plugin |
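To make the typical flow concrete, here is a minimal sketch of the allocate/release lifecycle with the v2020.1 C API. The kernel name is illustrative, and field and constant names should be checked against /opt/xilinx/xrm/include/xrm.h:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <xrm.h>

int main(void)
{
    /* Connect to the xrmd daemon. */
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL) {
        fprintf(stderr, "failed to create XRM context\n");
        return 1;
    }

    /* Describe the CU by kernel name; "my_kernel" is illustrative. */
    xrmCuProperty cuProp;
    xrmCuResource cuRes;
    memset(&cuProp, 0, sizeof(cuProp));
    memset(&cuRes, 0, sizeof(cuRes));
    strcpy(cuProp.kernelName, "my_kernel");
    cuProp.devExcl = false;   /* do not request exclusive card access */
    cuProp.requestLoad = 100; /* request the full CU (1-100) */

    /* Ask the daemon for a matching CU on any of the loaded cards. */
    if (xrmCuAlloc(ctx, &cuProp, &cuRes) == 0) { /* 0 == XRM_SUCCESS */
        printf("allocated CU %d on device %d\n", cuRes.cuId, cuRes.deviceId);
        /* ... hand the CU to XRT and run the accelerated work ... */
        xrmCuRelease(ctx, &cuRes); /* give the CU back to the pool */
    } else {
        fprintf(stderr, "no CU available for my_kernel\n");
    }

    xrmDestroyContext(ctx);
    return 0;
}

To build such an application, compile against the installed headers and library, for example with -I/opt/xilinx/xrm/include -L/opt/xilinx/xrm/lib -lxrm (the library name is assumed here; check the lib directory of your install).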
Conclusion
XRM allocates accelerated compute units across multiple Alveo cards. Applications leveraging XRM can run on any compute unit resources on the server as soon as they become available. With XRM, the cards themselves are abstracted away: they are registered into a database that the XRM daemon accesses.
About Frédéric Rivoallon
Frédéric Rivoallon is a member of the software marketing team in San Jose, CA and is the product manager for Xilinx HLS. Besides high-level synthesis, Frédéric also has expertise in compute acceleration with Xilinx devices, RTL synthesis, and timing closure. Past experience taught him video compression and board design.
About Bin Tu
Bin Tu received his bachelor's degree in Computer Science from Peking University in 1997 and his master's degree in Computer Science from Peking University in 2001. In 2001 he started as a Software Engineer at the Sun Microsystems China R&D Center, and later at the Oracle (which acquired Sun) China/US R&D Center, where he worked on Solaris operating system device drivers, network protocols, network virtualization, performance optimization, and more. He joined Xilinx in 2019 and now works as a Senior Staff Engineer on Xilinx cloud deployment technology, including Xilinx FPGA resource management, Kubernetes, Docker containers, and load balancers.