Safety & security considerations in embedded vision applications
FPGAs and SoCs offer secure configuration, fallback configuration and TrustZone execution.
Aaron Behman and Adam Taylor
Embedded vision (EV) systems are used across a wide range of applications from Advanced Driver Assistance Systems (ADAS) to machine vision with stops in medical imaging, augmented reality and many other applications.
While the inclusion of an EV system adds significant benefits to the end application, it is incumbent on developers to ensure that its inclusion cannot result in loss of life, injury or damage to property. Achieving this requires considering not only the safety of the design, by following an engineering life cycle and agreed standards, but also the security of the EV system, to prevent it from being modified maliciously or otherwise.
The end application for the EV system will drive the safety and security requirements. For example, a consumer application will have significantly fewer requirements than an ADAS or machine vision system.
To aid in these design considerations and in meeting safety and security requirements, there are several well-known international standards such as IEC61508, which acts as an umbrella for many electronic systems requiring functional safety. There are also more application-specific standards such as ISO26262 for automotive applications, IEC62061 for machinery, and DO178 / DO254 for flight applications.
Additionally, commercial applications also require CE, UL or CSA marking depending upon the end market. Each of these standards comes with development and verification requirements which need to be realized within the implementing organization's engineering as well as delivery life cycle to ensure compliance.
While the heart of an EV system is the processing core, the system will typically contain an FPGA or programmable System on a Chip (SoC), within which a number of the considerations raised thus far can be addressed.
What do these standards actually mean?
Many of these safety standards define levels of safety with different names, from Safety Integrity Level (SIL) for IEC61508, to Design Assurance Level (DAL) for DO254, and Automotive SIL (ASIL) for ISO26262.
Within SIL, DAL and ASIL there are a number of levels, which are applied depending upon the criticality of the application. Typically, these levels are defined by the number of hours to failure, or more correctly the Time to Failure in hours. While the differing standards are generally aligned, there are some differences as shown above (Figure 1).
When the design analysis is performed, it demonstrates how the required level for certification is achieved. Engineers tend to work with failure rates expressed in FIT (failures per 10^9 device hours), which are proportional to the reciprocal of the Time to Failure in hours. Achieving the SIL 4 and DAL A levels requires a correctly architected system.
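As a rough illustration of the relationship between Time to Failure and failure rate, the sketch below converts between the two, taking one FIT to mean one failure per 10^9 device hours. The function names are illustrative only and are not taken from any of the standards mentioned above.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device hours


def mttf_hours_to_fit(mttf_hours: float) -> float:
    """Convert a mean time to failure (hours) into a failure rate in FIT."""
    if mttf_hours <= 0:
        raise ValueError("Time to failure must be positive")
    return HOURS_PER_FIT_UNIT / mttf_hours


def fit_to_mttf_hours(fit: float) -> float:
    """Convert a failure rate in FIT back into a mean time to failure (hours)."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_UNIT / fit
```

Because failure rates of items in series add, working in FIT makes it straightforward to sum the contributions of individual components when budgeting against a target level.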
Systematic considerations
The development of safe systems requires excellent systems engineering practice with clearly defined and traceable requirements at each level of development (Figure 2).
As discussed above, the engineering life cycle will be determined by the end application and the resultant certification required. This life cycle will define the overall engineering approach to be taken from concept to production and disposal of the EV System.
It is within this life cycle that engineering review gates controlling the progress of the project must be defined. During these reviews, independent technical experts will examine requirements, designs, technical reports and test results to determine whether the design is mature enough to progress to the next stage, or whether further work needs to be performed to achieve the desired standard of evidence.
The engineering plan will also outline the verification and validation process at every level, which is undertaken to gain the body of evidence to achieve compliance against the applicable standard. This may require testing of the EV system across environmental operating ranges, dynamic vibration and shock. Subjecting the EV system to accelerated life testing also ensures that the operating life of the system can be achieved.
When it comes to security, consider the high-level issues engineers face attempting to secure their designs. These include the following:
Designing in quality
Depending upon the end application, component selection and manufacturing standards must be carefully considered to ensure compliance with the quality requirements for the application. When it comes to the processing core, it's best to use FPGA and SoC devices that are appropriately rated. Whether the application requires conformance to regular commercial quality standards or more demanding standards such as Industrial, Automotive, Aerospace, or Defense, the engineering team can build in quality from day one by specifying the correct component grade at the start of the project.
There are also a number of design techniques used to help achieve the stringent requirements of these standards. To help ensure the design meets the reliability requirements, often called the probability of success, apply reliability engineering techniques such as creating a reliability block diagram of the functions within the system and ensuring that any dangerous failure modes and single points of failure are eliminated where necessary.
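A reliability block diagram can be evaluated with two simple rules: blocks in series must all work, while redundant blocks in parallel fail only if every one of them fails. The sketch below is a minimal illustration that assumes independent failures; the block names and reliability figures are invented for the example.

```python
from math import prod


def series(*reliabilities: float) -> float:
    """All blocks must work: multiply the individual reliabilities."""
    return prod(reliabilities)


def parallel(*reliabilities: float) -> float:
    """Redundant blocks: the group fails only if every block fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical EV system: one sensor, a duplicated processing channel,
# and one power supply. All figures are placeholders.
SENSOR = 0.99
PROCESSING_CHANNEL = 0.98
POWER_SUPPLY = 0.995

system_reliability = series(
    SENSOR,
    parallel(PROCESSING_CHANNEL, PROCESSING_CHANNEL),
    POWER_SUPPLY,
)
```

In this arrangement the sensor and the power supply remain single points of failure, which is the kind of finding the block diagram is intended to expose.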
Within the design itself, perform a Failure Modes, Effects and Criticality Analysis (FMECA), keeping in mind that the level at which this is performed can vary on an application-by-application basis, from functional block to component level. The FMECA will consider the potential failure modes, the next effect and the end effect upon the system. It will also consider whether the fault can be detected by the built-in self-testing and monitoring system.
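One way to picture an FMECA worksheet is as a table of failure modes, each recording its next effect, its end effect and whether built-in test can detect it. The sketch below is a hypothetical structure only: the field names, the 1 to 4 severity scale and the example rows are assumptions made for illustration, not a format required by any standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FailureMode:
    item: str
    mode: str
    next_effect: str
    end_effect: str
    severity: int          # hypothetical scale: 1 (negligible) to 4 (catastrophic)
    detected_by_bit: bool  # can built-in test detect this fault?


def undetected_critical(worksheet: List[FailureMode],
                        min_severity: int = 3) -> List[FailureMode]:
    """Return severe failure modes that built-in test cannot detect."""
    return [fm for fm in worksheet
            if fm.severity >= min_severity and not fm.detected_by_bit]


# Invented example rows.
WORKSHEET = [
    FailureMode("Image sensor", "No output", "Pipeline stalls",
                "Loss of function", 3, True),
    FailureMode("Frame buffer", "Stuck frame", "Stale image processed",
                "Incorrect output", 4, False),
    FailureMode("Status LED", "Open circuit", "No indication",
                "None", 1, False),
]
```

Filtering the worksheet in this way highlights the entries that most need either a design change or additional monitoring.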
If developing a component-level FMECA, then consider a Part Stress Analysis (PSA) of each component within the design to ensure that it is operating with the correct derating. The level of derating applied will depend upon the chosen standard. Commonly used standards include those of the US Department of Defense (MIL-STD-1547) and the European Space Agency (ECSS-Q-30-11A).
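At its simplest, a part stress check compares the applied stress on each component with its rating multiplied by a derating factor. The sketch below assumes a single stress parameter per part; the part references, ratings and derating factors are placeholders and are not taken from either standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartStress:
    ref: str                # reference designator, e.g. "C12"
    parameter: str          # stressed parameter, e.g. "voltage"
    rated: float            # manufacturer's rated value
    applied: float          # worst-case applied value, same unit as rated
    derating_factor: float  # maximum allowed fraction of the rating (placeholder)

    @property
    def stress_ratio(self) -> float:
        return self.applied / self.rated

    @property
    def compliant(self) -> bool:
        return self.stress_ratio <= self.derating_factor


def overstressed(parts: List[PartStress]) -> List[str]:
    """Return the references of parts that exceed their derated limit."""
    return [p.ref for p in parts if not p.compliant]


# Invented example parts.
PARTS = [
    PartStress("C12", "voltage", rated=16.0, applied=12.0, derating_factor=0.6),
    PartStress("R3", "power", rated=0.25, applied=0.1, derating_factor=0.5),
]
```

Here the capacitor runs at 75% of its rated voltage against an assumed 60% limit, so it would be flagged for a higher-rated replacement.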
If a PSA is not performed, devices may be used that are overstressed and, as a result, could become the life-limiting factor of the equipment. Their failure may or may not lead to loss or degradation of the system, depending upon the FMECA predictions. Finally, along with the reliability aspects, it's also critical to perform a threat analysis on the system that will determine the threats to the system based upon the use cases, and the potential mitigation strategies for the threats identified.
Architecture Case Study
At the hardware level, it's important to consider the functionality of the system and how proper implementation of the functional safety and security will be achieved. While this can be implemented from scratch, it is much better to select components that already support these features, for example, the Xilinx Zynq All Programmable SoC.
The heart of any EV System is the image processing pipeline. This requires high-bandwidth processing ability combined with supervisory and control capability. The Zynq AP SoC enables a tightly-integrated architecture, as opposed to the traditional processor and FPGA combination.
This tighter integration between the processor and logic fabric not only allows for a better SWaP-C (size, weight, power and cost) solution but also provides for a more secure system, because the interaction between the two is not available externally for malicious or other access.
Within the electronic architecture, the embedded security architecture of the Zynq AP SoC can be used to provide for secure configuration. Within both the processing system (PS) and the programmable logic (PL), there is a three-stage process to ensure system partitions are secure. These stages comprise a Hash-based Message Authentication Code (HMAC), Advanced Encryption Standard (AES) decryption and RSA authentication. Both AES and HMAC use 256-bit secret keys, while RSA uses 2048-bit keys. The security architecture of the Zynq AP SoC also allows JTAG access to be enabled or disabled.
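To show what the HMAC stage contributes, the sketch below computes and checks an authentication tag over a partition using Python's standard library. It is a generic illustration of HMAC with a 256-bit key, not a reproduction of the device's boot process or image format: the function names are hypothetical and the use of SHA-256 is an assumption.

```python
import hashlib
import hmac


def sign_partition(key: bytes, partition: bytes) -> bytes:
    """Compute an HMAC tag over the partition contents (SHA-256 assumed)."""
    return hmac.new(key, partition, hashlib.sha256).digest()


def partition_is_authentic(key: bytes, partition: bytes,
                           expected_tag: bytes) -> bool:
    """Check the stored tag against a freshly computed one.

    compare_digest performs a constant-time comparison, which avoids
    leaking how many leading bytes of the tag matched.
    """
    return hmac.compare_digest(sign_partition(key, partition), expected_tag)
```

Any change to the partition, or use of the wrong key, produces a different tag, so a modified configuration is rejected before it is used.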
These security features are enabled when generating the boot file and the configuration partitions for the non-volatile boot media. It is also possible to define a fallback partition such that, should the first stage boot loader fail to load its application, it will fall back to another copy of the application stored at a different memory location.
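The fallback idea can be sketched as trying each stored image in order and accepting the first one that passes a validity check. In the illustration below a CRC32 stands in for whatever check the boot loader actually performs; the function name and image layout are assumptions, not the device's mechanism.

```python
import zlib
from typing import Optional, Sequence, Tuple


def select_boot_image(images: Sequence[Tuple[bytes, int]]) -> Optional[int]:
    """Return the index of the first image whose CRC32 matches its stored value.

    Each entry is (payload, expected_crc). None means no valid image was found.
    """
    for index, (payload, expected_crc) in enumerate(images):
        if zlib.crc32(payload) == expected_crc:
            return index
    return None
```

Placing the primary application first and a known-good copy second means a corrupted primary image results in the second copy being selected rather than a failed boot.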
Once the device is successfully up and running, the ARM TrustZone architecture can be used to implement orthogonal worlds, which limit access to hardware functions within the Zynq AP SoC, including programmable logic (PL) peripherals, and segment memory and L2 cache so that interaction between the secure and non-secure worlds is limited.
When it comes to implementing the image processing pipeline within the Zynq AP SoC PL fabric, it's also possible to use TrustZone to provide secure or non-secure access to IP cores within the programmable logic fabric. This enables secure access for critical aspects of the image processing chain, preventing unauthorized changes to the configuration. The image processing pipeline can be implemented using either custom-developed modules or modules from the IP library.
Some safety and security implementations (IEC61508, for example) may require isolation of design elements from each other. This may be a result of modular redundancy, differing safety areas or test functions. Physical separation between the identified zones can be enforced using the Isolation Design Flow (IDF), which is supported for the Zynq when used with Vivado Design Suite (Figure 3).
Isolation Design Flow can be very useful when implementing majority voting within the processing chain or other control logic. Its use ensures that the only interconnection between redundant modules is via trusted paths.
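Majority voting across three redundant modules can be expressed as a bitwise two-out-of-three function. The sketch below models the voter in software purely for illustration; in a real design it would be implemented in the logic fabric, and the function names are hypothetical.

```python
from typing import List


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote: each output bit follows the majority."""
    return (a & b) | (a & c) | (b & c)


def disagreeing_channels(a: int, b: int, c: int) -> List[str]:
    """Name the channels whose value differs from the voted result."""
    voted = majority_vote(a, b, c)
    return [name for name, value in (("a", a), ("b", b), ("c", c))
            if value != voted]
```

Reporting which channel disagreed, as well as masking the error, lets the health monitoring function record a failing module before a second fault defeats the vote.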
When it comes to implementing the design, there are also a number of device- and tool-specific techniques to consider. Of course, the end application and overall engineering management plan will determine the necessity of implementing these techniques.
• Use of Error Detection and Correction (EDAC) codes on memories. If necessary, this can be combined with a scrubbing function, which periodically reads and corrects the data in memory whether or not the application is accessing it.
• Exploiting the Hamming distance when defining control words: increasing the Hamming distance between command words, while requiring more bits to implement, can help with the reliability of the design.
• For critical commands, use the ARM and FIRE approach which requires two separate commands to action critical functions.
• Use of EDAC codes on external communication interfaces
• Comprehensive Built-in Test (BIT) capability, which can report on the health or status of the system. The Zynq XADC makes for a very capable element of a BIT system, as it allows the device voltages and temperatures to be monitored and external signals to be brought in through the mux.
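Two of the techniques listed above, widely spaced command words and the ARM and FIRE approach, can be sketched together. The command values below are hypothetical 8-bit words chosen so that any two differ in at least four bits, and the class is an illustrative model rather than a prescribed implementation.

```python
from itertools import combinations
from typing import Iterable

# Hypothetical command words; any two differ in at least four bit positions.
COMMANDS = {
    "IDLE": 0x00,
    "ARM": 0x0F,
    "FIRE": 0xF0,
    "SAFE": 0xFF,
}


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which two words differ."""
    return bin(x ^ y).count("1")


def min_hamming_distance(words: Iterable[int]) -> int:
    """Smallest pairwise distance across a set of command words."""
    return min(hamming_distance(x, y) for x, y in combinations(words, 2))


class ArmAndFire:
    """Action a critical function only when FIRE directly follows ARM."""

    def __init__(self) -> None:
        self.armed = False

    def handle(self, word: int) -> bool:
        """Process one received word; return True only if the action fires."""
        if word == COMMANDS["ARM"]:
            self.armed = True
            return False
        fired = self.armed and word == COMMANDS["FIRE"]
        # Any word other than ARM, valid or corrupted, disarms the controller.
        self.armed = False
        return fired
```

With a minimum distance of four, a single corrupted bit cannot turn one valid command into another, and a FIRE word received without a preceding ARM is ignored.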
Wrap Up
There are a number of components, tools and development methodologies available to system developers for achieving the correct industry standard for functional safety in embedded vision systems. To ensure an EV system meets its certifications, it's important to start with correct identification of the applicable standards, then follow with generation of an appropriate engineering management plan that defines the engineering life cycle used to gather the necessary evidence to achieve certification.
Aaron Behman, Director of Strategic Marketing, Embedded Vision, Xilinx Inc. (San Jose, CA, USA; www.xilinx.com) and Adam Taylor, Director Engineering & Training, Adiuvo Engineering (Harlow, Essex, UK; www.adiuvoengineering.com).