Course Creator and Instructor
In the 21st century, embedded systems are the systems of the future, with cellular phones, smartphones, and tablets becoming the dominant platforms for computing and communication. The ubiquity of information, and the need for the computation that accompanies it, is driving this revolution, which new paradigms such as the Internet of Things (IoT) will only accelerate. These platforms are clearly distinct in their processing requirements: real-time needs, high performance at low energy, compact code and data segments, and, most importantly, an ever-changing software stack. Such requirements have led to a complete redesign and reinvention of both the hardware and the software stack from the ground up. For example, new processors such as ARM, DSPs, and network processors were invented, along with new virtual machines such as Dalvik, new operating systems such as Android, and new programming models and compiler optimizations. The goal of this course is to take a holistic view of the embedded system stack, with a focus on processor architectures, instruction sets, and the advanced compiler optimizations that take advantage of them. The course covers the following segments:
Part I: Embedded Processor Architectures
- 1. Introduction to instruction level parallelism: Pipelining, RISC vs CISC, Very Long Instruction Word (VLIW) instruction sets, Hardware complexity (Superscalars) vs Compiler Optimizations (VLIWs) Tradeoffs
- 2. Design of Instruction Set Architectures: VLIW encoding, Exposing vs Hiding Architectural Details, RISC vs CISC ISAs, Opportunities for compilers, Dependences and Independences, Instruction bundling for VLIW, Compact instruction representation.
- 3. Embedded Micro-architectures: Scratch-pad (software-managed memory), clustered register files, special arithmetic, addressing modes for special needs (DSPs), branches in embedded domains (speculation and predication), unbundling branches
Part II: Software Optimizations
- 4. Introduction to Compiler phases: Overall working of the compiler, overview of phases, intermediate representation, backend code generation issues
- 5. Register Allocation Foundation: RISC philosophy (load, store architecture), Live range analysis, Interference Graph, Graph Coloring Based Register Allocation, Live Range Splitting
- 6. Register Allocation for Embedded Processors: Post-pass register allocation, Allocation gaps and register reuse, Energy reduction due to reduced memory accesses, Differential register allocation, Register encoding, Hardware support, Increase in exposed registers, Software pipelining and energy reduction
- 7. Data Layouts for Embedded Processors: Auto addressing mode, Data layouts, Simple and general offset assignment problems, Address sequence optimizations, Memory coalescing, Data and code segment minimization
- 8. Data and Code Compaction: X-Y memory, Parallelizing Loads/Stores in DSPs, Data replication, Performance vs Data Segment size, ARM vs Thumb code generation, Mixed code generation, Frequent values in embedded programs and their encoding, Data cache optimization via compaction
- 9. Network Processors: Processing in the network, Network processors, Dual Bank Register Allocation for Network Processors, Multi-threading in network processors, Context switch and latency, Register allocation across threads to minimize latency
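To give a flavor of topic 5, the following is a minimal sketch of graph-coloring register allocation: live ranges are turned into an interference graph, which is then colored with a simplify-and-select pass in the spirit of Chaitin's allocator. The function names, the half-open live-range encoding, and the sample ranges are illustrative assumptions, not course material, and the spill heuristic (highest remaining degree) is only one of many possible choices.

```python
from typing import Dict, List, Optional, Set, Tuple


def build_interference(live: Dict[str, Tuple[int, int]]) -> Dict[str, Set[str]]:
    """Two variables interfere when their half-open ranges [start, end) overlap."""
    graph: Dict[str, Set[str]] = {v: set() for v in live}
    names = list(live)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = live[a], live[b]
            if s1 < e2 and s2 < e1:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def allocate(graph: Dict[str, Set[str]], k: int) -> Dict[str, Optional[int]]:
    """Map each variable to a register in 0..k-1, or to None if it is spilled."""
    work = {v: set(n) for v, n in graph.items()}  # mutable copy of the graph
    stack: List[str] = []
    spilled: Set[str] = set()

    # Simplify: remove nodes with fewer than k neighbors; they are always colorable.
    while work:
        easy = [v for v in work if len(work[v]) < k]
        if easy:
            v = min(easy)  # deterministic choice for the example
            stack.append(v)
        else:
            # Assumed heuristic: spill the node with the highest remaining degree.
            v = max(sorted(work), key=lambda x: len(work[x]))
            spilled.add(v)
        for n in work.pop(v):
            work[n].discard(v)

    # Select: pop nodes and give each the lowest register unused by its neighbors.
    assignment: Dict[str, Optional[int]] = {v: None for v in spilled}
    while stack:
        v = stack.pop()
        used = {assignment[n] for n in graph[v] if assignment.get(n) is not None}
        assignment[v] = next(r for r in range(k) if r not in used)
    return assignment


# Hypothetical live ranges, given as (first definition, last use + 1).
live_ranges = {"a": (0, 4), "b": (1, 3), "c": (2, 6), "d": (5, 7)}
graph = build_interference(live_ranges)
three_regs = allocate(graph, 3)
two_regs = allocate(graph, 2)
print(three_regs)
print(two_regs)
```

With three registers every variable in this example receives a register; with two, one variable is marked as spilled (`None`) and would live in memory instead.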
It is recommended that students who take this course have previously taken at least an undergraduate-level course in computer architecture. In addition, the students must have a strong background in C and/or C++.
Assignments and Evaluation
There will be a series of homework assignments spread throughout the semester, each on its respective topic. Some homework may involve programming exercises.
In addition, there will be one or two projects: the first based on embedded processor architectures and the second on compiler-based software optimizations. There will be a comprehensive final examination covering all topics. There will be no midterm.
Minimum Technical Requirements
Students may be required to purchase a Raspberry Pi 2 board (cost $35) for one of the projects.
- Browser and connection speed: An up-to-date version of Google Chrome or Firefox is strongly recommended. A connection speed of 2+ Mbps is recommended.
- Operating System: Windows XP or higher with latest updates; Mac OS X 10.6 or higher with latest updates; or Linux (any recent distribution will work so long as you can install Python and OpenCV).
- Virtual Machine: You will be provided a virtual machine (VM) useful for performing class assignments and projects. For the projects, the supplied resources are identical to those used to test your submissions. Details for downloading and installing the VM can be found on T-Square.
All Georgia Tech students are expected to uphold the Georgia Tech Academic Honor Code.