Orchestrating Alveo Compute Workloads with XRM
Overview
What if we could abstract the Alveo card resources, query the available compute resources, and launch work on them? This is exactly what the Xilinx Resource Management (XRM) service allows! It is a resource manager with a set of APIs built on top of the Xilinx runtime (XRT). With XRM, multiple applications can run together while sharing multiple Alveo cards in a flexible arrangement.
Provisioning Compute Units from a Pool of Alveo Cards
With XRM, several xclbin files are preloaded onto different cards; applications then retrieve the hardware compute units their algorithms require without worrying about which card will service them.
Advantages of using XRM:
- Abstracting the physical card from the application: the application only manages compute units, while the XRM daemon takes care of finding an available compute unit within any of the preloaded cards
- Allocating compute units with or without exclusive access to a card. This finer-grained access control enables higher utilization of the underlying cards
XRM assigns resources at the compute unit (CU) level. Because a given xclbin programs a whole card but can contain several CUs, managing at CU granularity rather than at the xclbin level lets this card management system make better use of the underlying resources.
Building and Installing XRM
XRM is available on the Xilinx GitHub (https://github.com/Xilinx/XRM) and is compatible with the v2020.1 release of Vitis.
The steps below describe how to build and install XRM on Linux (both CentOS and Ubuntu):
1: Download from the GitHub repo
git clone https://github.com/Xilinx/XRM.git
2: Set up the build environment:
The following command assumes XRT, the Xilinx runtime, is already installed on the server.
source /opt/xilinx/xrt/setup.sh
3: Run "make" under the base directory of XRM
For Centos, the install requires a recent version of GCC, users will need to upgrade from their defaults and might want to use software collections to switch to a newer version:
scl enable devtoolset-(6|7) bash
Build the Linux package for the target flavor of Linux (CentOS or Ubuntu)
source /opt/xilinx/xrt/setup.sh
./build.sh
cd ./Release
make package
4: Install the package created in the previous step
Ubuntu:
sudo apt install --reinstall ./Release/xrm_version.deb
CentOS:
sudo yum reinstall ./Release/xrm_version.rpm
Once done, the XRM files are installed in /opt/xilinx/xrm/
Launch the Daemon and Set Up the XRM Database
As discussed in the introduction, the XRM API commands allocate compute units (CUs), which are encapsulated in the xclbin binary files. To establish the correspondence between the CUs referred to in the API and the xclbin files, XRM needs to know which assets are available on the server; this is done through a local database managed by the xrmadm command:
1: Source the setup scripts:
source /opt/xilinx/xrt/setup.sh
source /opt/xilinx/xrm/setup.sh
2: Start the xrmd daemon and optionally clean up the database for a fresh start (root access required):
sudo /opt/xilinx/xrm/tools/start_xrmd.sh
To stop the daemon service:
sudo /opt/xilinx/xrm/tools/stop_xrmd.sh
To check the daemon status:
sudo systemctl status xrmd
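An application can also verify that the daemon is reachable programmatically with xrmIsDaemonRunning(). Here is a minimal sketch against the v2020.1 C API (check /opt/xilinx/xrm/include/xrm.h for the exact signatures):

#include <stdio.h>
#include <xrm.h>

int main(void)
{
    /* A context is required for any XRM call; it connects to xrmd. */
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx != NULL && xrmIsDaemonRunning(ctx))
        printf("xrmd is running\n");
    else
        printf("xrmd is NOT running\n");
    if (ctx != NULL)
        xrmDestroyContext(ctx);
    return 0;
}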
3: Load or unload xclbin binaries specified in a JSON file
The input is in JSON format; refer to the examples under /opt/xilinx/xrm/test/ for how to specify the xclbin file and device for load/unload operations.
cd /opt/xilinx/xrm/test/
xrmadm list_cmd.json (To check the current system state)
xrmadm load_devices_cmd.json (To load xclbin files to devices)
xrmadm list_cmd.json (To check the load result)
xrmadm unload_devices_cmd.json (To unload xclbin from devices)
xrmadm list_cmd.json (To check the unload result)
Here is an example of a JSON load input file for the xrmadm command:
{
  "request": {
    "name": "load",
    "request_id": 1,
    "parameters": [
      {
        "device": 0,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 1,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 2,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      },
      {
        "device": 3,
        "xclbin": "/repo/xclbins/file.xclbin.xrm"
      }
    ]
  }
}
The list command then produces this style of output, also in JSON format (see below). Each card's information is extracted from the xclbin metadata, and the kernel and compute unit names are listed:
{
  "response": {
    "name": "list",
    "request_id": "1",
    "status": "ok",
    "data": {
      "device_number": "4",
      "device_0": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_1": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_2": {
        "cu_0": {
        },
        "cu_1": {
        }
      },
      "device_3": {
        "cu_0": {
        },
        "cu_1": {
        }
      }
    }
  }
}
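Loading and unloading can also be done from application code, without going through xrmadm, using xrmLoadOneDevice() and xrmUnloadOneDevice(). Below is a minimal sketch reusing the device ID and xclbin path from the load file above (verify the exact signatures against xrm.h):

#include <stdio.h>
#include <xrm.h>

int main(void)
{
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL)
        return 1;

    /* Load the xclbin onto device 0, as the JSON example above does. */
    char xclbin[] = "/repo/xclbins/file.xclbin.xrm";
    if (xrmLoadOneDevice(ctx, 0, xclbin) < 0)
        fprintf(stderr, "load failed on device 0\n");

    /* ... allocate CUs and run work here ... */

    /* Unload device 0 once the xclbin is no longer needed. */
    xrmUnloadOneDevice(ctx, 0);
    xrmDestroyContext(ctx);
    return 0;
}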
Check the System
To check the system configuration, the XRM GitHub repository provides example test executables to verify the CUs.
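Availability can also be checked from application code; for example, xrmCheckCuAvailableNum() reports how many CUs matching a given kernel property could be allocated right now. A minimal sketch (the kernel name is illustrative, not taken from a real xclbin):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <xrm.h>

int main(void)
{
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL)
        return 1;

    /* Describe the CU we are interested in; "my_kernel" is illustrative. */
    xrmCuProperty cuProp;
    memset(&cuProp, 0, sizeof(cuProp));
    strcpy(cuProp.kernelName, "my_kernel");
    cuProp.devExcl = false;   /* no exclusive card access needed */
    cuProp.requestLoad = 100; /* ask for a fully free CU */

    printf("%d CU(s) available\n", xrmCheckCuAvailableNum(ctx, &cuProp));
    xrmDestroyContext(ctx);
    return 0;
}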
XRM API Commands for Application Developers
In this section we present the main APIs available to developers for creating XRM-enabled applications.
This list was composed from a version that might since have been superseded; check the repository for the current list of API functions.
| Command | Description |
| --- | --- |
| xrmCreateContext() | Establishes a connection with the XRM daemon |
| xrmDestroyContext() | Disconnects an existing connection with the XRM daemon |
| xrmIsDaemonRunning() | Checks whether the daemon is running |
| xrmLoadOneDevice() | Loads an xclbin to one device |
| xrmUnloadOneDevice() | Unloads the xclbin from one device |
| xrmCuAlloc() | Allocates a compute unit (device, CU, and channel) given a kernel name or alias (or both) and a requested load (1-100). This function also provides the xclbin and plugin to be loaded on the device. |
| xrmCuListAlloc() | Allocates a list of compute unit resources given a list of kernel properties, each with a kernel name or alias (or both) and a requested load (1-100) |
| xrmCuRelease() | Releases a previously allocated resource |
| xrmCuListRelease() | Releases a previously allocated list of resources |
| xrmCuGetMaxCapacity() | Retrieves the maximum capacity associated with a resource |
| xrmCuCheckStatus() | Returns whether or not a specified CU resource is busy |
| xrmAllocationQuery() | Queries the compute unit resources given the allocation service ID |
| xrmCheckCuAvailableNum() | Checks the number of CUs available on the system given a kernel property with name or alias (or both) and a requested load (1-100) |
| xrmCheckCuListAvailableNum() | Checks the number of CU lists available on the system given a list of kernel properties with kernel name or alias (or both) and a requested load |
| xrmCheckCuPoolAvailableNum() | Checks the number of CU pools available on the system given a pool of kernel properties with kernel name or alias (or both) and a requested load |
| xrmCuPoolReserve() | Reserves a pool of compute unit resources given a pool of kernel properties with kernel name or alias (or both) and a requested load (1-100) |
| xrmCuPoolRelinquish() | Relinquishes a previously reserved pool of resources |
| xrmReservationQuery() | Queries the compute unit resources given the reservation ID |
| xrmExecPluginFunc() | Executes the function of a specified plugin |
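To make the typical flow concrete, here is a minimal sketch of the allocate/release lifecycle with the v2020.1 C API. The kernel name is illustrative, and field and constant names should be checked against /opt/xilinx/xrm/include/xrm.h:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <xrm.h>

int main(void)
{
    /* Connect to the xrmd daemon. */
    xrmContext ctx = xrmCreateContext(XRM_API_VERSION_1);
    if (ctx == NULL) {
        fprintf(stderr, "failed to create XRM context\n");
        return 1;
    }

    /* Describe the CU by kernel name; "my_kernel" is illustrative. */
    xrmCuProperty cuProp;
    xrmCuResource cuRes;
    memset(&cuProp, 0, sizeof(cuProp));
    memset(&cuRes, 0, sizeof(cuRes));
    strcpy(cuProp.kernelName, "my_kernel");
    cuProp.devExcl = false;   /* do not request exclusive card access */
    cuProp.requestLoad = 100; /* request the full CU (1-100) */

    /* Ask the daemon for a matching CU on any of the loaded cards. */
    if (xrmCuAlloc(ctx, &cuProp, &cuRes) == 0) { /* 0 == XRM_SUCCESS */
        printf("allocated CU %d on device %d\n", cuRes.cuId, cuRes.deviceId);
        /* ... hand the CU to XRT and run the accelerated work ... */
        xrmCuRelease(ctx, &cuRes); /* give the CU back to the pool */
    } else {
        fprintf(stderr, "no CU available for my_kernel\n");
    }

    xrmDestroyContext(ctx);
    return 0;
}

To build such an application, compile against the installed headers and library, for example with -I/opt/xilinx/xrm/include -L/opt/xilinx/xrm/lib -lxrm (the library name is assumed here; check the lib directory of your install).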
Conclusion
XRM allocates accelerated compute units across multiple Alveo cards. Applications leveraging XRM can run on any compute unit resources on the server as soon as they become available. With XRM, the cards themselves are abstracted away: they are registered into a database that the XRM daemon accesses.
About Frédéric Rivoallon
Frédéric Rivoallon is a member of the software marketing team in San Jose, CA and is the product manager for Xilinx HLS. Besides high-level synthesis, Frédéric also has expertise in compute acceleration with Xilinx devices, RTL synthesis, and timing closure. Past experience taught him video compression and board design.
About Bin Tu
Bin Tu received his bachelor's degree in Computer Science from Peking University in 1997 and his master's degree in Computer Science from Peking University in 2001. In 2001 he started as a Software Engineer at the Sun Microsystems China R&D Center, and later at the Oracle (which acquired Sun) China/US R&D Center, where he worked on Solaris operating system device drivers, network protocols, network virtualization, performance optimization, and more. He joined Xilinx in 2019 and now works as a Senior Staff Engineer on Xilinx cloud deployment technology, including Xilinx FPGA resource management, Kubernetes, Docker containers, and load balancers.