FiLMiC (the company behind the mobile app FiLMiC Pro) has announced that it has patented a highly efficient image remapping technology called Cubiform, which is said to work 4.75 times faster than conventional GPU pipelining methods. This sounds revolutionary at first glance, but it is revolutionary only to a limited extent: on closer inspection, Cubiform is essentially an applied matrix multiplication (which, as a description of an algorithm, would probably not be patentable here in Europe).
The idea is to combine the various filter steps into a single lookup table (LUT). Whenever a parameter of a sub-filter changes, the entire Cubiform LUT must of course be recalculated, but once it has been calculated, the whole effect chain can always be rendered extremely quickly.
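The principle can be sketched in a few lines of NumPy: several per-pixel colour operations are baked into one table once, after which rendering is a single lookup per pixel no matter how long the chain is. The filter names and parameters below are purely illustrative, not FiLMiC's actual pipeline.

```python
import numpy as np

# Hypothetical per-channel filter steps (point operations on values in [0, 1]).
def lift_shadows(x):
    return np.clip(x + 0.05, 0.0, 1.0)

def gamma(x, g=2.2):
    return np.clip(x, 0.0, 1.0) ** (1.0 / g)

def gain(x, k=0.9):
    return np.clip(x * k, 0.0, 1.0)

# Bake the whole chain into one 256-entry lookup table.
levels = np.arange(256) / 255.0
combined_lut = np.round(gain(gamma(lift_shadows(levels))) * 255.0).astype(np.uint8)

# Rendering is now a single gather per pixel, regardless of chain length.
image = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
fast = combined_lut[image]

# Same result as running the three steps individually on every pixel.
slow = np.round(gain(gamma(lift_shadows(image / 255.0))) * 255.0).astype(np.uint8)
assert np.array_equal(fast, slow)
```

Changing any parameter (the gamma value, say) invalidates `combined_lut`, which is exactly the recalculation cost described above.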
In practice, such Cubiform LUTs must be somewhat more complex (and much larger) than typical simple LUTs, among other things because masks can evidently also be included in the calculation. Nevertheless, we can confirm from our own experience that the principle works, provided you have access to enough fast memory for the Cubiform LUTs.
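The size issue becomes clear with a full 3-D colour LUT, which maps every (r, g, b) triple to a new triple: a typical 33-point cube already holds 33³ = 35,937 RGB entries. The following toy sketch (our own illustration, not Cubiform's format) builds a tiny 8-point identity cube, bakes a desaturation into it, and applies it with nearest-neighbour lookup; real implementations interpolate trilinearly.

```python
import numpy as np

N = 8  # toy resolution; production LUTs are typically 33 or 65 points per axis
grid = np.linspace(0.0, 1.0, N)
r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
lut = np.stack([r, g, b], axis=-1)        # shape (N, N, N, 3): identity cube

# Bake a simple 50% desaturation into the cube (Rec. 709 luma weights).
luma = lut @ np.array([0.2126, 0.7152, 0.0722])
lut = 0.5 * lut + 0.5 * luma[..., None]

# Nearest-neighbour lookup; a real renderer would interpolate trilinearly.
def apply_lut(pixels, lut):
    idx = np.clip(np.round(pixels * (N - 1)).astype(int), 0, N - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

out = apply_lut(np.array([[1.0, 0.0, 0.0]]), lut)   # pure red, half-desaturated
```

Folding masks into the table as well means the lookup can no longer depend on colour alone, which is one reason the Cubiform tables end up larger than a plain colour cube.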
FiLMiC itself states that a GPU pipeline requiring 26.4 milliseconds of computing time could be completed in 5.6 ms with Cubiform. In best marketing fashion, FiLMiC claims that Cubiform can thus outperform an equivalent GPU pipeline by a factor of 4.75. What goes unmentioned, however, is the additional time required by the encoder.
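As a quick sanity check on the quoted numbers:

```python
# FiLMiC's published timings: 26.4 ms for the GPU pipeline vs 5.6 ms for Cubiform.
speedup = 26.4 / 5.6
# Works out to roughly 4.71x, close to the advertised factor of 4.75.
```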
Also left unsaid is that the order of the effects in the chain must always follow certain rules, and that many effects are not suitable for pre-calculation into a combined lookup table at all. If an effect in the chain changes the position of a pixel, for example, the entire concept breaks down.
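Why geometric effects cannot be baked in is easy to demonstrate: a LUT is a pure function of the input colour, so two pixels with the same input colour must always receive the same output colour. A positional effect violates exactly that, as this small sketch shows.

```python
import numpy as np

image = np.array([10, 200, 10], dtype=np.uint8)

# A geometric effect: shift every pixel one position to the right (wrapping).
shifted = np.roll(image, 1)          # [10, 10, 200]

# Pixels 0 and 2 share the input value 10, so any LUT must map them to the
# same output value -- but the shift requires 10 and 200 respectively.
# No table indexed by colour alone can encode this effect.
same_input = image[0] == image[2]     # True
same_output = shifted[0] == shifted[2]  # False
```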
In addition, the speed comparison against several individual effects does not hold up. If those effects were instead programmed as a single "chain effect", many memory accesses could be eliminated, and the balance would look much better against the Cubiform solution.
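The memory-traffic argument can be illustrated as follows. In a multi-pass pipeline, every effect reads the full frame and writes a full intermediate buffer; a hand-fused chain effect performs the same arithmetic in a single pass with no intermediates. (NumPy still allocates internal temporaries, so this is an illustration of the idea, not a benchmark; real fusion happens at the GPU kernel level.)

```python
import numpy as np

frame = np.random.rand(1920 * 1080).astype(np.float32)

# Multi-pass: three full traversals of the frame, each producing a buffer.
def pass_gain(buf):  return buf * 1.1
def pass_lift(buf):  return buf + 0.02
def pass_clip(buf):  return np.minimum(np.maximum(buf, 0.0), 1.0)

multi = pass_clip(pass_lift(pass_gain(frame)))

# Hand-fused "chain effect": identical math in one expression, so the
# per-pass intermediate buffers (and their memory traffic) disappear.
fused = np.minimum(np.maximum(frame * 1.1 + 0.02, 0.0), 1.0)

assert np.array_equal(multi, fused)
```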
Progressive engine approaches are currently heading in this second direction: the effect chain is "pre-compiled" into a fixed combination effect in the editing programme, which can then render much faster than the combination of individual effects. In theory, this would even speed up every type of effect.
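Conceptually, such pre-compilation composes the chain into a single callable once, up front, and hands the renderer that one function. The sketch below shows the idea with per-pixel Python closures; a real engine would instead emit one fused GPU shader from the chain description.

```python
from functools import reduce

# "Pre-compile" an effect chain into a single function (illustrative only).
def compose(effects):
    return reduce(lambda f, g: (lambda v: g(f(v))), effects, lambda v: v)

chain = [
    lambda v: v * 1.1,                  # gain
    lambda v: v + 0.02,                 # lift
    lambda v: min(max(v, 0.0), 1.0),    # clip
]
render_one = compose(chain)

render_one(0.5)   # ~0.57: the whole chain in one call per pixel
```

Unlike the LUT approach, this composition works for any effect, including geometric ones, since it fuses the code rather than tabulating colour values.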
As a next step, such effect chains could even be compiled into FPGA IP, which would mean a further boost for real-time effects. However, we are not currently aware of any development in this direction. Until now, we had suspected that Apple was aiming this way with its Afterburner accelerator card for the Mac Pro, but there are still no reliable signs of this...