Every transistor earns its place.
Custom silicon discipline means no transistor is on-die without a measurable contribution to your model's execution graph. No general-purpose matrix engine overhead. No tensor format converters for formats your model doesn't use. No fallback execution paths for workloads that will never be scheduled.
Die floor plan.
Abstract representation of the PCU-1 eval die — sectors partitioned by function, sized by workload contribution. Not a production die photo.
From application to silicon — five layers, no gaps.
Every layer is co-designed with the hardware. The application sees a familiar inference API; the silicon executes exactly the operations specified by the frozen model graph.
Integration characteristics
Standard rack. Standard OS. Non-standard performance.
Procunit hardware ships as a standard half-height, half-length PCIe card — the same form factor and power rails your infrastructure team works with today. Procunit is not a network-attached accelerator that requires dedicated fabric. It's not a compute blade that demands custom rack units. It drops into an existing server slot.
Driver installation follows the standard Linux kernel module pattern. The OS enumerates the device at PCIe initialization; Procunit's runtime layer handles all model-specific execution scheduling above the kernel boundary. Your existing monitoring stack reads IPMI and Redfish telemetry without modification.
- CUDA-compatible shim allows PyTorch / TensorFlow models to run without framework changes
- Monitoring via standard IPMI / Redfish interfaces for existing ops tooling
- Driver signing compatible with UEFI Secure Boot
- Thermal: operates within standard server rack ambient temperature range (5–40°C)
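The monitoring and thermal items above can be combined into a simple ambient check. The Python sketch below assumes a payload shaped like the DMTF Redfish Thermal resource, with a Temperatures array whose entries carry Name, PhysicalContext, and ReadingCelsius. Which sensors a given BMC reports, and whether the card publishes its own readings there, is not stated on this page, so the function name and the sample payload are illustrative only.

```python
import json

# Ambient range quoted in the integration list above (degrees Celsius).
AMBIENT_MIN_C = 5.0
AMBIENT_MAX_C = 40.0


def intake_out_of_range(thermal_payload, low=AMBIENT_MIN_C, high=AMBIENT_MAX_C):
    """Return (name, reading) for intake sensors outside [low, high] Celsius.

    thermal_payload is a dict decoded from a Redfish Thermal resource.
    Sensors without a reading, or with a non-intake context, are skipped.
    """
    flagged = []
    for sensor in thermal_payload.get("Temperatures", []):
        if sensor.get("PhysicalContext") != "Intake":
            continue
        reading = sensor.get("ReadingCelsius")
        if reading is None:
            continue
        if not (low <= reading <= high):
            flagged.append((sensor.get("Name", "unnamed"), reading))
    return flagged


if __name__ == "__main__":
    # Hypothetical payload for illustration; not captured from real hardware.
    sample = json.loads("""
    {
      "Temperatures": [
        {"Name": "Inlet Temp", "PhysicalContext": "Intake", "ReadingCelsius": 43},
        {"Name": "CPU1 Temp", "PhysicalContext": "CPU", "ReadingCelsius": 71}
      ]
    }
    """)
    print(intake_out_of_range(sample))
```

In practice the payload would come from the server's BMC over HTTPS; fetching it is left out here because endpoint paths and credentials vary by vendor.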
$ lspci | grep -i procunit
03:00.0 Processing accelerators: Procunit Inc. PCU-1 Eval (rev 01)
$ cat /sys/bus/pci/devices/0000:03:00.0/power/runtime_status
active
$ procunit-cli status
PCU-1 [0000:03:00.0] — operational
Model: locked (gpt2-inference-v3)
TDP utilization: 62% (93W / 150W)
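The first two commands in the session above can also be scripted. This is a minimal Python sketch, assuming a Linux host with sysfs mounted: it looks for PCI devices whose class code is 0x12 / 0x00 ("Processing accelerators", the class lspci prints) and reads each device's runtime power status. The function name is hypothetical, and the filter matches any vendor's device in that class, since this page does not give a Procunit PCI vendor ID to narrow it further.

```python
from pathlib import Path

# PCI base class 0x12, subclass 0x00: "Processing accelerators".
PROCESSING_ACCELERATOR_CLASS = 0x1200


def find_accelerators(sysfs_root="/sys/bus/pci/devices"):
    """Return (pci_address, runtime_status) for each processing accelerator."""
    found = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        try:
            # sysfs exposes the 24-bit class code as hex text, e.g. "0x120000".
            class_code = int((dev / "class").read_text().strip(), 16)
        except (OSError, ValueError):
            continue
        # Drop the low byte (programming interface) and compare class/subclass.
        if class_code >> 8 != PROCESSING_ACCELERATOR_CLASS:
            continue
        try:
            status = (dev / "power" / "runtime_status").read_text().strip()
        except OSError:
            status = "unknown"
        found.append((dev.name, status))
    return found


if __name__ == "__main__":
    for address, status in find_accelerators():
        print(f"{address}: {status}")
```

Taking the sysfs root as a parameter keeps the function easy to exercise against a temporary directory on machines without the card installed.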
Start with your model graph.
Share your frozen model in a free initial consultation. NDA-first, no commitment.
Request Evaluation