Utilities¶
Device detection and capability checking utilities.
Overview¶
These utilities help you write portable code that adapts to available hardware.
```python
from rotalabs_accel import (
    get_device,
    is_cuda_available,
    is_triton_available,
    get_device_properties,
)

# Auto-select the best device
device = get_device()  # Returns 'cuda' if available, else 'cpu'

# Check capabilities
print(f"CUDA available: {is_cuda_available()}")
print(f"Triton available: {is_triton_available()}")

# Get detailed GPU info
if is_cuda_available():
    props = get_device_properties()
    print(f"GPU: {props['name']}")
    print(f"VRAM: {props['total_memory'] / 1e9:.1f} GB")
```
API Reference¶
Functions¶
get_device ¶
Get a torch device, with smart defaults.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device` | `Optional[str]` | Device string (`'cuda'`, `'cpu'`, `'cuda:0'`, etc.). If `None`, returns CUDA if available, else CPU. | `None` |
Returns:

| Type | Description |
|---|---|
| `device` | `torch.device` instance. |
Example

```python
device = get_device()          # Auto-detect
device = get_device('cuda:1')  # Specific GPU
```
Source code in src/rotalabs_accel/utils/device.py
is_cuda_available ¶
Check whether a CUDA device is available.
is_triton_available ¶
Check whether Triton kernels can be used on this platform.
get_device_properties ¶
Get device properties and capabilities.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device` | `Optional[device]` | Device to query. If `None`, uses the current CUDA device. | `None` |
Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with device properties (see the property table under Device Properties below). |
Example

```python
props = get_device_properties()
print(f"GPU: {props['name']}")
if props['supports_fp8']:
    print("FP8 quantization available!")
```
Source code in src/rotalabs_accel/utils/device.py
select_dtype ¶
Select the best available dtype for the device.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `preferred` | `dtype` | Preferred dtype if supported. | `float16` |
| `device` | `Optional[device]` | Device to check capabilities for. | `None` |
Returns:

| Type | Description |
|---|---|
| `dtype` | Best supported dtype. |
Example

```python
dtype = select_dtype(torch.bfloat16)
model = model.to(dtype)
```
Source code in src/rotalabs_accel/utils/device.py
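To make the fallback rule concrete, here is a hypothetical sketch of the selection logic, using dtype names as plain strings so it runs without torch; the real `select_dtype` operates on `torch.dtype` values, and the property names follow the `get_device_properties()` table in this document.

```python
# Capability flags as get_device_properties() might report them;
# these sample values are assumptions for illustration.
AMPERE_PROPS = {"supports_bf16": True}
VOLTA_PROPS = {"supports_bf16": False}

def select_dtype_sketch(preferred: str, props: dict) -> str:
    """Fall back from the preferred dtype when the device lacks support."""
    if preferred == "bfloat16" and not props.get("supports_bf16", False):
        return "float16"  # BF16 needs Ampere+; degrade to FP16
    return preferred

# BF16 passes through on Ampere, degrades to FP16 on Volta
print(select_dtype_sketch("bfloat16", AMPERE_PROPS))  # bfloat16
print(select_dtype_sketch("bfloat16", VOLTA_PROPS))   # float16
```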
Usage Patterns¶
Portable Device Selection¶
```python
import torch

from rotalabs_accel import get_device

device = get_device()

# Works on any platform
model = Model().to(device)
x = torch.randn(1, 512, 4096, device=device)
y = model(x)
```
Conditional Logic Based on Capabilities¶
```python
import torch

from rotalabs_accel import (
    is_cuda_available,
    is_triton_available,
    get_device_properties,
)

if is_triton_available():
    print("Using Triton-optimized kernels")
else:
    print("Falling back to PyTorch")

# Select dtype based on GPU capabilities
if is_cuda_available():
    props = get_device_properties()
    if props.get('supports_bf16', False):
        dtype = torch.bfloat16
        print("Using BF16 (Ampere+)")
    else:
        dtype = torch.float16
        print("Using FP16")
else:
    dtype = torch.float32
    print("Using FP32 on CPU")
```
Multi-GPU Selection¶
```python
from rotalabs_accel import get_device

# Select a specific GPU
device = get_device("cuda:0")
device = get_device("cuda:1")

# Force CPU even if a GPU is available
device = get_device("cpu")
```
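Device strings follow PyTorch's `type[:index]` convention. As a hypothetical illustration of how such a string decomposes (this helper is not part of the library), splitting on the colon yields the device type and optional index:

```python
def parse_device_string(spec: str) -> tuple:
    """Split a device string like 'cuda:1' into (type, index).

    Hypothetical helper for illustration; get_device() accepts the
    same 'cuda', 'cpu', 'cuda:0' strings and returns a torch.device.
    """
    if ":" in spec:
        kind, _, idx = spec.partition(":")
        return kind, int(idx)
    return spec, None  # no explicit index

print(parse_device_string("cuda:1"))  # ('cuda', 1)
print(parse_device_string("cpu"))     # ('cpu', None)
```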
Device Properties¶
The `get_device_properties()` function returns a dictionary with:

| Property | Type | Description |
|---|---|---|
| `name` | `str` | GPU name (e.g., `"NVIDIA A100-SXM4-80GB"`) |
| `compute_capability` | `tuple` | Compute capability (e.g., `(8, 0)`) |
| `total_memory` | `int` | Total VRAM in bytes |
| `supports_bf16` | `bool` | BF16 tensor core support (Ampere+) |
| `supports_fp8` | `bool` | FP8 support (Hopper+) |
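The two boolean flags follow directly from the compute capability thresholds in the table (BF16 on Ampere and newer, FP8 on Hopper and newer). A hypothetical sketch of that mapping, not the library's actual implementation:

```python
def capability_flags(cc: tuple) -> dict:
    """Derive the boolean capability flags from a compute capability tuple.

    Mirrors the thresholds documented above; illustrative only.
    """
    return {
        "supports_bf16": cc >= (8, 0),  # Ampere and newer
        "supports_fp8": cc >= (9, 0),   # Hopper and newer
    }

print(capability_flags((8, 0)))  # A100: BF16 yes, FP8 no
print(capability_flags((9, 0)))  # H100: both
```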
GPU Generation Detection¶
```python
from rotalabs_accel import get_device_properties

props = get_device_properties()
cc = props['compute_capability']

if cc >= (9, 0):
    print("Hopper (H100) - FP8 support")
elif cc >= (8, 0):
    print("Ampere (A100/A10) - BF16 tensor cores")
elif cc >= (7, 0):
    print("Volta/Turing (V100/T4)")
else:
    print("Older GPU")
```
Triton Availability¶
Triton requires:
- Linux operating system
- NVIDIA GPU with CUDA
- Python 3.8+
On other platforms, is_triton_available() returns False and all kernels automatically fall back to PyTorch.
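A minimal, pure-Python approximation of this check (a sketch, not the library's implementation; the real `is_triton_available()` may also verify that a CUDA GPU is present):

```python
import importlib.util
import platform

def triton_available_sketch() -> bool:
    """Approximate is_triton_available(): Linux plus an importable triton.

    Simplified for illustration; does not check for a CUDA GPU.
    """
    if platform.system() != "Linux":
        return False  # Triton requires Linux
    return importlib.util.find_spec("triton") is not None

print(triton_available_sketch())
```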