Designing productively for a hybrid architecture


24 September 2021

By Patrice Brossard, EMEA Vertical Segment Manager (FPGAs and ASICs), Future Electronics

The inexorable progress of integration in semiconductors has been blurring the boundaries between different types of components for many years. By loading up one type of component with IP from a different kind, a sensor can become a machine-learning inference engine, a microcontroller can be made to behave like an applications processor, and non-volatile memory can provide a secure hardware root-of-trust – a capability that normally requires a dedicated secure element.

In late 2019, this blurring took another step forward with the merging of two types of system with almost entirely different sets of attributes: FPGAs and processors running real-time applications in a Linux operating environment. We say “different sets of attributes” because an FPGA consists of a programmable hardware fabric that supports parallel processing, whereas a processor is a hardware platform with a fixed Instruction Set Architecture (ISA) that executes instruction threads serially.

The 2019 introduction of the Microchip PolarFire SoC FPGA (Figure 1) bolted together these two platforms into a single system-on-chip that provides:

The resources expected of a mid-density FPGA, including up to 461k logic elements, up to 1,420 18×18 math blocks, up to 33Mbits of user SRAM, eight fully-configurable PLLs, a high-speed DDR4 interface capable of data rates to 1.6Gbits/s, and multi-protocol transceivers operating at data rates to 12.7Gbits/s, with two hard-wired PCIe Gen2 end-points/root ports. Fabricated on a mature, low-power SONOS 28nm process, the FPGA achieves very low power consumption; see Figure 2 for comparisons with other devices.
A multi-core applications processor built around a quad-core 64-bit RISC-V cluster (RV64GC application cores, each with 2×32kbyte L1 cache with error correction code (ECC) and virtual memory support) and a separate 64-bit RISC-V monitor core (RV64IMAC, supporting the integer, multiplication/division, atomic and compressed instruction sets). All five cores operate at speeds up to 625MHz. The processor’s I/O includes two Gigabit Ethernet controllers, a USB 2.0 On-The-Go controller and two CAN interfaces.

With this combination, electronics system designers have a single chip that offers the low power consumption, high thermal efficiency and defence-grade security of an FPGA, together with the deterministic execution capabilities of a fast processor.

This hybrid SoC platform offers unique capabilities and advantages in applications such as systems operating in environments subject to extreme temperatures, Artificial Intelligence (AI) inferencing at the edge, security-conscious applications, aerospace and defence systems, and communications infrastructure.

The hybrid processor/FPGA architecture

At the heart of the PolarFire SoC FPGA is a deterministic, coherent CPU cluster of 4+1 RISC-V cores; see Figure 1. RISC-V is a free and open functional specification for a processor’s ISA, and is backed by a growing ecosystem of development professionals, specifications, software and other resources.

For the CPU cluster in the PolarFire SoC, Microchip has developed its own hardware architecture in collaboration with SiFive. A unique feature of the Microchip implementation is the freedom to turn off CPU branch prediction and make the memory sub-system fully deterministic. This eliminates all variation in execution time whilst maintaining the high processor performance provided by the four RISC-V cores and relying on the deterministic features of the PolarFire FPGA.
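
The link between branch prediction and timing variation can be illustrated with a toy model. This is only a sketch, not a description of the PolarFire SoC’s actual predictor: it assumes a single 2-bit saturating counter, a one-cycle cost for a correctly predicted branch, a three-cycle misprediction penalty and, with prediction off, a flat two-cycle cost per branch. All of these figures and function names are illustrative assumptions.

```python
def cycles_with_predictor(outcomes, hit_cost=1, miss_penalty=3):
    """Cycles spent on a branch sequence using one 2-bit saturating counter.

    Counter states 0-1 predict "not taken", states 2-3 predict "taken".
    The costs are illustrative assumptions, not PolarFire SoC figures.
    """
    state = 2  # start in "weakly taken"
    cycles = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        cycles += hit_cost
        if predicted_taken != taken:
            cycles += miss_penalty  # pay extra for a wrong guess
        # Move the counter towards the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return cycles


def cycles_without_predictor(outcomes, fixed_cost=2):
    """Assume every branch pays the same cost, whatever its history."""
    return fixed_cost * len(outcomes)


steady = [True] * 8              # loop-like pattern, easy to predict
alternating = [True, False] * 4  # data-dependent pattern, hard to predict

print(cycles_with_predictor(steady), cycles_with_predictor(alternating))
print(cycles_without_predictor(steady), cycles_without_predictor(alternating))
```

In this model the same eight branches cost 8 or 20 cycles with prediction on, depending on the data, but 16 cycles in both cases with prediction off: slower on the easy pattern, yet identical every time.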

The fifth core, a monitor core, is used to manage the boot process and system configuration. Unlike the application processing cores, it does not include virtual memory support.


Fig. 1: The PolarFire SoC’s architecture combines separate FPGA hardware and a CPU cluster

All the SoC’s memories feature ECC with single-error correction and double-error detection (SECDED), providing a very high level of data integrity – a mandatory requirement in safety-critical applications, for instance in aerospace systems.

The PolarFire SoC FPGA’s low power consumption is an obvious advantage in battery-powered systems. It also benefits any other system by removing the need for a heat-sink or fan, which reduces system cost, size and weight, and improves reliability.


Fig. 2: Comparison of power consumption between the PolarFire SoC and Arm Cortex-A microprocessor cores

Memory partition supports real-time Linux operation

Alongside the mid-density FPGA portion of the PolarFire SoC, Microchip has also implemented an architecture that provides real-time, deterministic multi-processing. To run operating-system software on a multi-core system, an MPU manufacturer can choose one of two types of multi-processing architecture:

In Symmetric Multi-Processing (SMP) all cores share the main memory. The cores in SMP are homogeneous, i.e., the OS treats each one equally. This means more identical cores can be added for increased performance.
In Asymmetric Multi-Processing (AMP) the OS treats cores differently, i.e., they don’t share the memory or peripherals. This allows the system designer to assign certain kinds of tasks to one core, while leaving the other(s) free to run the OS, for example.

Certain features of a typical SMP architecture, such as branch prediction and cache misses, make it impossible for the SoC to operate deterministically. Execution time is inconsistent and cannot be guaranteed because every core is exposed to periodic interrupts. By contrast, the AMP implementation allows the user to carve out a part of the cache memory and reserve it for a real-time application’s exclusive use. The PolarFire SoC supports both SMP and AMP modes, giving that choice, and even allowing a change of modes during field updates. Once the AMP mode is configured in the PolarFire SoC, the real-time application can run on one of the application cores, a real-time core in which branch prediction has been turned off; see Figure 3.
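
The idea of carving up cores and cache between a Linux context and a real-time context can be modelled in a few lines. The sketch below is hypothetical: the `Context` class, the context names and the assumption that the L2 cache is divided into 16 ways are illustrative only and do not represent Microchip’s configuration format. It simply checks that no core or cache way is claimed by two contexts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """One AMP context: its cores and the L2 cache ways reserved for it."""
    name: str
    cores: frozenset
    l2_ways: frozenset


def find_conflicts(contexts):
    """Return a list of resources claimed by more than one context."""
    conflicts = []
    owners = {}  # maps ("core", 3) or ("way", 15) to the owning context name
    for ctx in contexts:
        claims = [("core", c) for c in sorted(ctx.cores)]
        claims += [("way", w) for w in sorted(ctx.l2_ways)]
        for claim in claims:
            if claim in owners:
                kind, index = claim
                conflicts.append(f"{kind} {index}: {owners[claim]} and {ctx.name}")
            else:
                owners[claim] = ctx.name
    return conflicts


# Three application cores for Linux, one for the real-time application.
# The 16-way split of the L2 cache is an assumption for illustration.
linux = Context("linux", frozenset({0, 1, 2}), frozenset(range(0, 12)))
realtime = Context("realtime", frozenset({3}), frozenset(range(12, 16)))

print(find_conflicts([linux, realtime]))  # no overlap, so an empty list
```

Because the real-time context owns its core and its slice of cache outright in this model, nothing the Linux context does can evict its data, which is the property that makes its timing predictable.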

This hardware structure supports the fully deterministic operation of real-time functions alongside the Linux OS. In addition, Interrupt Service Routine (ISR) execution times are deterministic – a claim which cannot be made as confidently for an SMP architecture implemented on an equivalent quad-core microprocessor based on Arm Cortex-A technology.
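
One way to quantify how deterministic ISR execution is in practice is to record its duration in CPU cycles over many runs and look at the spread. The helper below is a minimal sketch: the function name is hypothetical, the sample values are invented purely for illustration (they are not measurements), and the default clock is the 625MHz maximum core speed quoted earlier.

```python
def latency_stats(cycle_counts, clock_hz=625_000_000):
    """Summarise ISR duration samples given in CPU cycles.

    Returns best case, worst case and jitter (worst minus best), all in
    nanoseconds. clock_hz defaults to the 625MHz figure from the article.
    """
    if not cycle_counts:
        raise ValueError("need at least one sample")
    best_ns = min(cycle_counts) * 1e9 / clock_hz
    worst_ns = max(cycle_counts) * 1e9 / clock_hz
    return {"best_ns": best_ns, "worst_ns": worst_ns, "jitter_ns": worst_ns - best_ns}


# Hypothetical samples, for illustration only (not measured data).
deterministic = [250, 250, 250, 250]
variable = [250, 310, 275, 500]

print(latency_stats(deterministic))
print(latency_stats(variable))
```

A fully deterministic ISR shows zero jitter. For a real-time design the worst case and the jitter matter more than the average, because the deadline has to be met on every run.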


Fig. 3: In the PolarFire SoC’s AMP architecture, real-time functions access a dedicated portion of the L2 cache memory directly

Hybrid FPGA/MPU system design

A hybrid FPGA/MPU SoC offers a unique ability to meet the requirements of certain kinds of application with a single chip. An application that concurrently performs local inferencing based on a machine-learning model and controls in real time the operation of a safety-critical motor, for instance, can implement the AI functions in the PolarFire SoC’s FPGA and the safety-critical controls on the multi-core processor.

The presence of both an FPGA and an MPU on the same chip, however, means that the system design team has to straddle the two worlds – FPGA and microprocessor – and work in two independent design environments. Both the PolarFire SoC’s toolchains feed into a configurator which generates:

The ‘software’ configuration – the C data structures for initialising the memory map – which will be used in the SoftConsole integrated development environment (IDE);
The ‘hardware’ configuration – a so-called component – used in the Libero FPGA IDE.
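
To give a feel for what generating the ‘software’ configuration involves, the sketch below turns a list of memory regions into a C array initialiser after checking that no two regions overlap. It is only an illustration of the general idea: the function, the `mem_region` struct name, the region names and the addresses are all hypothetical and are not the output format of Microchip’s configurator.

```python
def render_memory_map(regions):
    """Render (name, base, size) tuples as a C array initialiser.

    Raises ValueError if any two regions overlap. The struct and field
    layout in the output are hypothetical, not Microchip's generated code.
    """
    ordered = sorted(regions, key=lambda region: region[1])
    # Once sorted by base address, any overlap shows up between neighbours.
    for (name_a, base_a, size_a), (name_b, base_b, _) in zip(ordered, ordered[1:]):
        if base_a + size_a > base_b:
            raise ValueError(f"{name_a} overlaps {name_b}")
    lines = ["static const struct mem_region memory_map[] = {"]
    for name, base, size in ordered:
        lines.append(f'    {{ "{name}", 0x{base:08X}u, 0x{size:08X}u }},')
    lines.append("};")
    return "\n".join(lines)


# Hypothetical regions and addresses, chosen only for illustration.
regions = [
    ("ddr", 0x80000000, 0x40000000),
    ("fabric_regs", 0x60000000, 0x00001000),
]
print(render_memory_map(regions))
```

Generating both the software and the hardware views from one description is what keeps the two toolchains consistent: a change to the memory map is made once and flows into both.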

The interaction between the two IDEs is shown in Figure 4.


Fig. 4: The relationship between the Libero IDE for the FPGA and the SoftConsole IDE for the MPU

Design simulation is also supported by separate tools: Renode for the software running on the multi-core processor part, and ModelSim for the FPGA part of the SoC. Microchip has also made good provision for debugging the complex applications that will run on the PolarFire SoC. Its on-chip-debugging mechanism communicates through a JTAG interface to debug tools:

C code can be debugged using a traditional OpenOCD debugger;
The FPGA debug tool is more specialised: a debug mechanism is embedded by default in the component, giving dynamic access to any internal node of the FPGA fabric. A dedicated tool, SmartDebug, uses this internal debugging circuitry to provide an intuitive means of debugging the FPGA-based part of the application.

Interestingly, the conditions for porting an application to the RISC-V environment are similar to those applying within the Arm environment. No two Arm core-based devices will have the same memory map and, equally, no two RISC-V-based systems will share the same memory map. So porting from one Arm core to another in principle requires the same effort as porting from an Arm core to a RISC-V core.

Design-friendly development environment

The PolarFire SoC, then, offers the advantage of integrating in a single chip both programmable hardware capabilities and a high-performance, multi-core platform for software applications. This hybrid architecture does entail the use of two development environments in parallel, but Microchip has taken care to provide a comprehensive, well-integrated set of tools that lets design teams work productively in:

Creating or migrating the system;
Simulating the design;
Programming both the hardware resources in the FPGA portion of the chip, and application software running on the processor cluster;
Debugging the system.