Machine Learning and Artificial Intelligence for Embedded Systems
With advances in artificial intelligence research, more and more areas of application are emerging. These traditionally resource-hungry technologies are increasingly finding their way into small, mobile and even wearable systems. This development is possible because specialised hardware and resourceful optimisations drastically reduce resource and energy requirements, while robust automation and templates increasingly lower the barriers to application.
Hybrid hardware platforms with AI acceleration
Consequently, part of our research consists of designing and implementing specialised hardware in the form of AI hardware accelerators. For this purpose, we use flexible FPGAs that allow us to implement arbitrary AI models on hardware in a cheap and efficient way.
The hybrid hardware platform Elastic Node, which is part of the Elastic AI ecosystem, can be considered both the fruit and the instrument of our research and is designed to meet demands for usability, adaptability and monitorability, while at the same time being efficient.
For this reason, we are also investigating adequate hardware-related optimisation options for this particular area. After all, precise knowledge of the hardware opens up many approaches for fully utilising the available resources. For example, resources can be saved by implementing individual model layers with quantised activation functions by means of LUTs. Other examples include not activating certain model layers at every clock cycle, parallelisation in the ALU, and many more.
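The LUT idea can be illustrated with a minimal sketch: a quantised activation function over a few-bit input has so few possible inputs that its outputs can be enumerated once and stored as a table, replacing the arithmetic with a single lookup. The bit widths, the signed fixed-point grid, and the hard-tanh activation below are illustrative assumptions, not details of the Elastic Node implementation.

```python
import numpy as np

def build_activation_lut(act_fn, input_bits=3, output_bits=3):
    """Precompute a quantised activation function as a lookup table.

    All 2**input_bits possible quantised inputs are enumerated, the
    activation is evaluated once, and the result is re-quantised, so
    that on hardware a single LUT read replaces the computation."""
    levels = 2 ** input_bits
    # Signed fixed-point input grid covering [-1, 1)
    inputs = (np.arange(levels) - levels // 2) / (levels // 2)
    outputs = act_fn(inputs)
    # Re-quantise the outputs to a signed grid of the same form
    out_levels = 2 ** output_bits
    codes = np.clip(np.round(outputs * (out_levels // 2)),
                    -out_levels // 2, out_levels // 2 - 1).astype(int)
    return codes  # index with the raw input code (offset by levels // 2)

# Hard tanh as an example activation
lut = build_activation_lut(lambda x: np.clip(x, -1.0, 1.0))
print(lut)  # one output code per possible 3-bit input
```

With 3-bit inputs the table has only eight entries, which is exactly the kind of object that maps cheaply onto FPGA logic.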
Contact: Chao Qian, M.Sc.
Optimisations for Neural Networks
Optimisation, both on the hardware and software side, generally plays a major role in our research.
It is a broad field; at the moment we primarily focus on highly quantised Neural Networks and Transformers.
In this context, we generally investigate optimisations for the training of networks while striving for the quickest and most efficient computations possible, for instance, through the use of separable convolutions.
In addition, in quantised neural networks (QNNs), a low bit depth of two or fewer bits, in conjunction with the properties of FPGAs, allows operations within the network to be pre-computed and stored in the LUTs of the configurable logic blocks for the shortest possible access times.
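To see why such pre-computation is feasible at two bits, note that a 2-bit-by-2-bit multiplication has only sixteen possible input pairs, so every product can be enumerated into a small table once and the network then computes dot products with lookups and additions only. The signed 2-bit encoding below is one possible choice, chosen for illustration.

```python
import itertools

# With 2-bit operands, a multiply has only 4 x 4 = 16 possible input
# pairs; every product can be precomputed and stored in a small table
# (on an FPGA: in the LUTs of the configurable logic blocks).
two_bit_values = [-2, -1, 0, 1]  # one possible signed 2-bit encoding

mul_lut = {(a, w): a * w
           for a, w in itertools.product(two_bit_values, repeat=2)}

def lut_dot(activations, weights):
    """Dot product using only table lookups and additions."""
    return sum(mul_lut[(a, w)] for a, w in zip(activations, weights))

print(lut_dot([1, -1, 0, -2], [1, 1, -2, -1]))  # 1 - 1 + 0 + 2 = 2
```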
Contact: Lukas Einhaus, M.Sc.
Transformer networks, originally designed for translation tasks, are also very suitable in a modified form for the prediction of future time series values, which makes them particularly interesting for certain control tasks. But in order to be used, e.g., within distributed systems in a wastewater network, such as in the RIWWER project, they have to be made faster and more lightweight. In this case, again, quantisation is our technique of choice.
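As a minimal sketch of the general principle, the snippet below applies symmetric linear post-training quantisation to a weight matrix: weights are mapped to 8-bit integers via a single scale factor, and the rounding error stays bounded by half a quantisation step. This illustrates one standard quantisation scheme, not the specific method used in RIWWER; the random weights are placeholders.

```python
import numpy as np

def quantise_symmetric(w, bits=8):
    """Symmetric linear post-training quantisation of a weight tensor."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax          # one scale for the tensor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # placeholder weights
q, scale = quantise_symmetric(w, bits=8)
w_hat = q * scale                               # dequantised approximation
err = np.max(np.abs(w - w_hat))
print(f"max abs error: {err:.5f} (bounded by scale/2 = {scale / 2:.5f})")
```

The integer tensor `q` is what would be stored and computed with on the device; only `scale` is kept in floating point.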
Contact: Tianheng Ling, M.Sc.
Spike sorting and online training for quantised neural networks
Another exciting application of quantised neural networks, which is currently being studied in the Sp:Ai:Ke research project, is the time series analysis of (real biological) neural signals.
The recordings of such signals always include whole bundles of signals as well as a certain amount of background activity. To get a clear picture of individual neural activities, sophisticated signal processing is required, which in addition has to have adaptive qualities.
More precisely, adjustments to the signal classification become necessary at runtime (i.e. online) due to neuronal drift; the deployed network must be continuously readjusted with only a few exemplary data samples.
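One simple way to picture such online readjustment is a lightweight nearest-class-mean head whose class prototypes are nudged toward a handful of freshly labelled samples, tracking the drift while the network in front stays fixed. This is a generic stand-in for illustration, not the classifier used in Sp:Ai:Ke; the class count, dimensionality and momentum are assumed values.

```python
import numpy as np

class OnlineClassifier:
    """Nearest-class-mean head that can be readjusted at runtime.

    Class prototypes are shifted toward a few new labelled samples to
    follow drift, while a (hypothetical) fixed feature extractor would
    sit in front of this head."""

    def __init__(self, n_classes, dim, momentum=0.5):
        self.protos = np.zeros((n_classes, dim))
        self.momentum = momentum

    def adapt(self, features, labels):
        """Online update from only a few exemplary samples per class."""
        for c in np.unique(labels):
            mean_c = features[labels == c].mean(axis=0)
            self.protos[c] = (self.momentum * self.protos[c]
                              + (1 - self.momentum) * mean_c)

    def predict(self, features):
        d = np.linalg.norm(features[:, None] - self.protos[None], axis=-1)
        return d.argmin(axis=1)

clf = OnlineClassifier(n_classes=2, dim=3)
clf.adapt(np.array([[1., 0., 0.], [0., 1., 0.]]), np.array([0, 1]))
print(clf.predict(np.array([[0.9, 0.1, 0.]])))  # assigned to class 0
```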
Contact: Leo Buron, M.Sc.
Hardware-aware neural architecture search
Finding the ideal neural architecture for a particular use case can take a lot of time and trial and error.
Neural Architecture Search (NAS for short) automates this design step and in some cases even finds architectures that outperform manually created ones. Furthermore, even the required resource estimation can be automated by a deep neural network.
In this way, the tailored deployment of neural networks is simplified greatly.
The method, which we currently focus on, is based on evolutionary algorithms and discovers optimal architectures by continuously mutating and weeding out contenders.
Our requirements typically include efficient operation on resource-constrained hardware, hence hardware costs such as latency or energy consumption are included as optimisation criteria in the architecture search.
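The mutate-and-weed-out loop with hardware costs in the objective can be sketched in a few lines. Everything concrete here is a placeholder assumption: the architectures are just lists of layer widths, and `accuracy` and `hardware_cost` are toy proxies standing in for real training runs and latency/energy measurements.

```python
import random

random.seed(0)

def hardware_cost(arch):
    """Hypothetical proxy: wider/deeper nets cost more on the device."""
    return sum(arch) / 100.0

def accuracy(arch):
    """Hypothetical proxy for task accuracy (stands in for training)."""
    return 1.0 - 1.0 / (1 + sum(arch) / 64)

def fitness(arch, cost_weight=0.5):
    """Hardware-aware objective: accuracy minus weighted hardware cost."""
    return accuracy(arch) - cost_weight * hardware_cost(arch)

def mutate(arch):
    """Randomly widen or narrow one layer (minimum width 8)."""
    arch = arch.copy()
    i = random.randrange(len(arch))
    arch[i] = max(8, arch[i] + random.choice([-8, 8]))
    return arch

# Evolutionary loop: continuously mutate contenders and weed out
# the weakest under the hardware-aware fitness.
population = [[32, 32, 32] for _ in range(8)]
for generation in range(20):
    population += [mutate(a) for a in population]   # mutate
    population.sort(key=fitness, reverse=True)
    population = population[:8]                     # weed out

best = population[0]
print(best, round(fitness(best), 3))
```

Because the cost term penalises width, the search settles on smaller architectures than pure accuracy would favour, which is the essence of including hardware costs as optimisation criteria.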
Our ongoing studies once again focus on applications in the domains of signal processing and time series analysis, e.g. to find the best WaveNet-based architectures for the simulation of various, often complex audio effects.
Contact: Christopher Ringhofer, M.Sc.