This technical evaluation measured the performance of cryptographic algorithms on a range of embedded platforms, with a particular focus on comparing dedicated hardware acceleration against pure software implementations.
Methodology & Comparability
To ensure objective comparability across different architectures, OpenSSL was used as a uniform test interface.
- Interface: The openssl speed command, both in its default mode and with the -evp option.
- Hardware Integration: On platforms like the TI Jacinto 7, the devcrypto engine was integrated to transparently delegate cryptographic operations to the hardware accelerators.
- Algorithms: Symmetric ciphers and hash functions (AES-CBC/ECB, ChaCha20, SHA-256/512) as well as asymmetric methods (RSA, ECDSA/ECDH with NIST and Brainpool curves) were tested.
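The benchmark interface described above could be driven by a small script along the following lines. This is a minimal sketch under stated assumptions, not the tooling used in the evaluation: the helper names (`build_speed_command`, `run_speed`, `parse_speed_table`) are hypothetical, the engine id `devcrypto` is assumed to match the integrated engine, and `SAMPLE_OUTPUT` is a made-up placeholder in the shape of the `openssl speed` summary table, not a measurement.

```python
import re
import subprocess


def build_speed_command(algorithm, engine=None, seconds=3):
    """Return the argv list for one `openssl speed -evp` run."""
    # -elapsed reports wall-clock time instead of CPU user time, which matters
    # when the work is offloaded and the CPU mostly waits for the accelerator.
    # -seconds is available in OpenSSL 1.1.1 and later.
    cmd = ["openssl", "speed", "-elapsed", "-seconds", str(seconds)]
    if engine is not None:
        # Assumption: the hardware path is exposed as an OpenSSL engine.
        cmd += ["-engine", engine]
    cmd += ["-evp", algorithm]
    return cmd


def run_speed(algorithm, engine=None, seconds=3):
    """Run the benchmark and return its standard output as text."""
    cmd = build_speed_command(algorithm, engine, seconds)
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


def parse_speed_table(output):
    """Parse the summary table into {algorithm: {block_bytes: bytes_per_s}}."""
    sizes = []
    results = {}
    for line in output.splitlines():
        if line.startswith("type"):
            # Header row lists the block sizes, e.g. "16 bytes  64 bytes ...".
            sizes = [int(n) for n in re.findall(r"(\d+) bytes", line)]
        elif sizes:
            fields = line.split()
            rates = fields[-len(sizes):]
            if len(fields) > len(sizes) and all(r.endswith("k") for r in rates):
                name = " ".join(fields[:-len(sizes)])
                # The table reports thousands of bytes per second ("k").
                results[name] = {
                    size: float(rate[:-1]) * 1000.0
                    for size, rate in zip(sizes, rates)
                }
    return results


# Placeholder text in the shape of the summary table (values are NOT measured).
SAMPLE_OUTPUT = """\
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      1000.00k     2000.00k     4000.00k     8000.00k    16000.00k    32000.00k
"""

if __name__ == "__main__":
    print(parse_speed_table(SAMPLE_OUTPUT))
```

On a target, `parse_speed_table(run_speed("aes-128-cbc", engine="devcrypto"))` and the same call without `engine` would give two tables with identical keys, which keeps the hardware and software runs directly comparable.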
Evaluated Hardware Platforms
The comparison covered a broad spectrum to highlight differences in the crypto architecture:
- Automotive SoCs: TI Jacinto 7 (ARM Cortex-A72 with Crypto Hardware) and NXP S32G (HSE - Hardware Security Engine).
- Automotive MCUs: Infineon AURIX TC38 (TriCore) as a safety-critical controller.
- Reference Platforms: Jetson Nano, Raspberry Pi 3/4 (ARM Cortex-A53/A72 without crypto extensions, or with them deactivated) and an x86 system as baseline.
Key Findings
1. Hardware Support is Crucial: The biggest differentiator was whether the hardware unit natively supported a given algorithm. Where hardware support was missing, the system fell back to a software implementation, leading to significant performance losses and high CPU load.
2. Throughput vs. Latency (“Small Block Penalty”): For very small data blocks (e.g., 16 bytes), the software implementation outperformed the hardware path, because the overhead of system calls (ioctls) and DMA setup outweighed the actual computation time. For large data blocks (16 KB and above), the relationship reversed: on the TI Jacinto 7, for example, throughput increased by up to a factor of 40 compared to the software solution.
3. System Offloading: Besides raw speed, the reduction in CPU load was the most important advantage of hardware accelerators. It allowed the SoC to run computation-intensive encryption in the background while the main cores remained available for critical real-time applications.
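The small block penalty from finding 2 can be illustrated with a first-order cost model: the software path costs only the per-byte computation, while the hardware path pays a fixed setup cost (system call plus DMA setup) before a faster per-byte rate applies. The sketch below is an assumption-laden illustration, not a model fitted to the measurements; the function names and all parameter values (`sw_rate`, `hw_rate`, `setup_s`) are hypothetical.

```python
def software_time(n_bytes, sw_rate):
    """Seconds to process n_bytes in software at sw_rate bytes per second."""
    return n_bytes / sw_rate


def hardware_time(n_bytes, hw_rate, setup_s):
    """Seconds to process n_bytes in hardware, including fixed setup cost."""
    return setup_s + n_bytes / hw_rate


def break_even_bytes(sw_rate, hw_rate, setup_s):
    """Block size at which both paths take equally long.

    Solves n / sw_rate == setup_s + n / hw_rate for n. Returns None when the
    hardware is not faster per byte, since it then never catches up.
    """
    if hw_rate <= sw_rate:
        return None
    return setup_s / (1.0 / sw_rate - 1.0 / hw_rate)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of the trade-off.
    sw_rate = 100e6   # bytes per second in software
    hw_rate = 400e6   # bytes per second in the accelerator
    setup_s = 30e-6   # fixed per-request overhead in seconds

    for size in (16, 16384):
        sw = software_time(size, sw_rate)
        hw = hardware_time(size, hw_rate, setup_s)
        faster = "software" if sw < hw else "hardware"
        print(f"{size:>6} bytes: {faster} is faster")
    print("break-even block size:", break_even_bytes(sw_rate, hw_rate, setup_s))
```

In this model the per-request overhead dominates for 16-byte blocks, while for large blocks the ratio of the two times approaches `hw_rate / sw_rate`, which is why the advantage only shows at larger block sizes.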
The results were extracted automatically and visualized to support well-founded architectural decisions for future automotive security systems.
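One way such post-processing could look is a per-block-size comparison of hardware and software throughput. The sketch below is only an illustration of the idea: the function names are hypothetical, the input format (a mapping from block size in bytes to bytes per second) is an assumption, and the numbers in the example are placeholders rather than results from the evaluation.

```python
def speedup_by_block_size(software, hardware):
    """Return {block_bytes: hardware_rate / software_rate}.

    Only block sizes present in both inputs with a positive software rate
    are included, sorted by block size.
    """
    return {
        size: hardware[size] / software[size]
        for size in sorted(software)
        if size in hardware and software[size] > 0
    }


def format_report(speedups):
    """Render the speedups as a plain-text table."""
    lines = [f"{'block (B)':>10}  {'hw/sw':>8}"]
    for size, ratio in speedups.items():
        verdict = "hardware faster" if ratio > 1.0 else "software faster or equal"
        lines.append(f"{size:>10}  {ratio:>8.2f}  {verdict}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder throughputs in bytes per second (NOT measured values).
    software = {16: 200.0, 1024: 200.0, 16384: 100.0}
    hardware = {16: 50.0, 1024: 200.0, 16384: 400.0}
    print(format_report(speedup_by_block_size(software, hardware)))
```

A ratio below 1 marks block sizes where the software path wins, so the small block penalty and the large-block gain both show up in the same table.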