Paper data
Title: Efficient IDCT Implementations on VLIW Processors
Author(s):
Bagni Daniele, STMicroelectronics
Borneo Antonio, STMicroelectronics
Celetto Luca, STMicroelectronics
Page numbers in the proceedings: Volume I, pp. 595-598
Session: Implementation
Paper abstract
In this paper we describe two efficient software implementations of the bi-dimensional IDCT (Inverse Discrete Cosine Transform). Instead of using the traditional separation into eight horizontal and eight vertical mono-dimensional IDCT stages, we apply a novel approach that directly decomposes the bi-dimensional IDCT into only eight mono-dimensional units followed by a network of addition and subtraction operations. We have then optimized this method in pure ANSI-C for 32-bit VLIW (Very Long Instruction Word) processors. By arranging the network structure to exploit sub-word parallelism and by defining entirely new multimedia instructions, we have implemented a second version that is 23% more efficient than the first. Our fixed-point arithmetic IDCT implementations are fully compliant with the IEEE 1180 standard, as required by most video compression standards.
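To illustrate what sub-word parallelism means for a network of additions and subtractions, the sketch below packs two 16-bit values into one 32-bit word and adds or subtracts both lanes at once in portable C. It is only an illustration under stated assumptions: the helper names (`pack16x2`, `lane_hi`, `lane_lo`, `add16x2`, `sub16x2`) are hypothetical, the bit-masking technique is a generic software emulation, and none of it is taken from the paper, whose multimedia instructions are not described in the abstract. It also uses `<stdint.h>`, which requires C99 rather than strict ANSI-C.

```c
#include <stdint.h>

/* Hypothetical helpers: two signed 16-bit lanes packed in one 32-bit word. */
#define LANE_MSB 0x80008000u  /* top bit of each 16-bit lane */

/* Pack two 16-bit values into one word (hi in bits 31..16, lo in bits 15..0). */
static uint32_t pack16x2(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Reinterpret a 16-bit pattern as a signed value without relying on
   implementation-defined narrowing conversions. */
static int16_t to_signed16(uint32_t u)
{
    u &= 0xFFFFu;
    return (u & 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

static int16_t lane_hi(uint32_t w) { return to_signed16(w >> 16); }
static int16_t lane_lo(uint32_t w) { return to_signed16(w); }

/* Lane-wise add (modulo 2^16 per lane): add the low 15 bits of each lane,
   then restore the top bits with XOR so no carry crosses the lane boundary. */
static uint32_t add16x2(uint32_t a, uint32_t b)
{
    uint32_t sum = (a & ~LANE_MSB) + (b & ~LANE_MSB);
    return sum ^ ((a ^ b) & LANE_MSB);
}

/* Lane-wise subtract (modulo 2^16 per lane): force the top bit of each lane
   of a to 1 so a borrow never leaves its lane, then fix the top bits. */
static uint32_t sub16x2(uint32_t a, uint32_t b)
{
    uint32_t diff = (a | LANE_MSB) - (b & ~LANE_MSB);
    return diff ^ ((a ^ ~b) & LANE_MSB);
}
```

With these helpers, one butterfly of an add/subtract network processes two data pairs per operation: `add16x2(a, b)` and `sub16x2(a, b)` yield the sums and differences of both lanes. A processor with dedicated packed-arithmetic instructions would perform each of these in a single operation instead of the masking sequence shown here.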