Abstract
Queue computing offers an attractive alternative for embedded systems. The main features of a queue-based processor are a dense instruction set, high parallelism, and low hardware complexity. This paper presents a code generation algorithm, implemented in the queue compiler infrastructure, that achieves high code density by targeting a queue-based instruction set processor. We evaluate the technique by comparing code size and extracted parallelism for a set of embedded applications against a set of conventional embedded processors. The compiled code is, on average, 12.03% more compact than MIPS16 code and 45.1% more compact than ARM/Thumb code. In addition, we show that the queue compiler, without optimizations, can deliver about 1.16 times more parallelism than fully optimized code for a register machine.
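To make the queue computation model behind these claims concrete, the following is a minimal sketch of a generic queue machine, not the code generation algorithm of the paper. It assumes a hypothetical tuple-based expression tree and hypothetical instruction names (`ld`, `add`, `sub`, `mul`). Instructions take their operands implicitly from the head of a FIFO queue and write results to its tail, so no operand fields are needed, which is one reason a queue instruction set can be dense. The program is obtained by a level-order traversal of the expression tree, deepest level first.

```python
from collections import deque

# Hypothetical expression tree: a variable name (str) or (op, left, right).
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mul": lambda x, y: x * y,
}


def level_order_program(tree):
    """Emit a queue program: visit levels from deepest to root, left to right."""
    levels = []
    frontier = [tree]
    while frontier:
        levels.append(frontier)
        next_frontier = []
        for node in frontier:
            if isinstance(node, tuple):
                next_frontier.extend(node[1:])  # children of an operator node
        frontier = next_frontier

    program = []
    for level in reversed(levels):
        for node in level:
            if isinstance(node, str):
                program.append(("ld", node))   # load a variable to the tail
            else:
                program.append((node[0],))     # operator with implicit operands
    return program


def run(program, env):
    """Execute a queue program: dequeue operands at the head, enqueue at the tail."""
    queue = deque()
    for instr in program:
        if instr[0] == "ld":
            queue.append(env[instr[1]])
        else:
            left = queue.popleft()
            right = queue.popleft()
            queue.append(OPS[instr[0]](left, right))
    return queue.popleft()


# (a + b) * (c - d)
expr = ("mul", ("add", "a", "b"), ("sub", "c", "d"))
program = level_order_program(expr)
result = run(program, {"a": 1, "b": 2, "c": 7, "d": 3})
```

In this sketch the `add` and `sub` instructions sit on the same tree level and do not depend on each other, which hints at how a level-order schedule exposes parallelism. The sketch handles only expression trees; shared subexpressions in a general data flow graph need additional treatment that it does not attempt.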
© 2009 Springer-Verlag Berlin Heidelberg
Cite this chapter
Canedo, A., Abderazek, B., Sowa, M. (2009). Compiler Support for Code Size Reduction Using a Queue-Based Processor. In: Stenström, P. (ed.) Transactions on High-Performance Embedded Architectures and Compilers II. Lecture Notes in Computer Science, vol 5470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00904-4_14
DOI: https://doi.org/10.1007/978-3-642-00904-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00903-7
Online ISBN: 978-3-642-00904-4
eBook Packages: Computer Science, Computer Science (R0)