Research Interests

My research interests are in the fields of operating systems, real-time/embedded systems, computer architecture and compilers. Over the last three to four years, I have been conducting active research in the field of real-time/embedded systems.


Static Timing Analysis

Real-time systems are those in which tasks (programs) have timing requirements in addition to requirements of logical correctness. Hard real-time systems are real-time systems in which the timing requirements must be satisfied in order to avoid catastrophic situations. Examples of hard real-time systems include medical devices such as pacemakers, emergency life-support systems, automotive engine control systems and aircraft and missile control systems.

Systems with periodic tasks executing in a hard real-time environment have been the focus of my research work conducted under the guidance of Dr. Frank Mueller at North Carolina State University. Such systems require knowledge of the worst-case execution times (WCETs) of all tasks a priori in order to guarantee schedulability.

Since safety is critical in hard real-time systems, static timing analysis is considered a viable approach for calculating upper bounds on the WCET of a task [8]. Static timing analysis models the architectural components of a computing system in order to obtain the time taken by the longest path identified in a given task.

Architectural features such as out-of-order pipelines and caches cause unpredictability in timing analysis, leading to overly pessimistic estimates of the WCET of a task. One such feature that is particularly hard to model is the data cache. I have devoted my research career so far to improving data cache characterization and thus increasing schedulability of task sets.

I had the opportunity to do an internship at Kidde Aerospace, working as part of a team that developed an Overheat Detection System for the Airbus A-380. During this period, I realized there is a real need for good static timing analysis tools that offer predictability and yet support architectural features such as a data cache. Currently, due to the lack of such a tool, hard-real time systems are forced to either turn off data caches or to provide a significant margin in the WCET calculations for all tasks in the system. The first option has the disadvantage of decreased performance while the second has the disadvantage of decreased schedulability. This realization served as a strong motivator in guiding my research direction.


Single Task Analysis

In the first phase of my research work, I enhanced a data cache analyzer framework known as the Cache Miss Equations framework [1, 6, 7]. The framework performed static data cache analysis to determine the maximum number of data cache misses that a task incurs. Mathematical equations that relate the iteration space of the task, the cache parameters and the memory references within the task are used to perform this calculation.

The new framework provides three fundamental enhancements over the original one and thus relaxes certain constraints that the original framework imposed on tasks. First, forced loop fusion is employed to concatenate loop bodies of sequential loop nests in order to build one single loop nest that may then be analyzed by the framework. Furthermore, a technique is proposed to allow non-rectangular loops in tasks. Second, the framework is extended to allow certain types of data dependent conditionals (those with only if-then clauses, but no else clauses). An upper bound on the number of data cache misses is provided in such a case. The third and most significant enhancement is that, while the original framework gave an upper bound on the number of data cache misses incurred by a task, the new one produces reference patterns that identify the positions of each of these misses as well. This data cache analysis framework was integrated into a static timing analyzer framework that calculates the WCET of a task. These enhancements helped pave the path to the next phase of my research.

This work resulted in a publication in the proceedings of IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2005 (acceptance rate - 33%).


Preemptive systems

Although data cache analysis for a single task is important to any timing analysis framework, it would be incomplete without considering the effects of interference between multiple tasks in a prioritized, preemptive environment. Hence, the next phase of my research was devoted to analyzing such systems and answering three important questions:

  • What is the maximum number of preemptions a task may undergo?
  • Where would each preemption be placed in the iteration space of the preempted task?
  • What is the additional delay incurred by the preempted task due to each of these preemptions?
By considering ranges of execution times between the best and the worst-case execution times, infeasible preemption points are eliminated and realistic worst-case scenarios are constructed. A range of iteration points where a task might be executing when it is preempted is identified. The delay at each point is calculated using access chains (reuse chains) and the maximum delay among them is considered to be the preemption delay at the given point. To the best of my knowledge, this framework is the first to integrate data cache analysis with preemption delay determination using response-time analysis. It is also the first time that critical instant of staggered task releases in such a context is considered.

Parts of this work resulted in a publication in the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2006 (acceptance rate - 29%) and the remainder in IEEE Real-Time Systems Symposium (RTSS), 2006 (acceptance rate - 24%). It also resulted in a publication in the Transactions on Embedded Systems (TECS), 2007.

Lee et al. [2, 3] and Staschulat et al. [4, 5] address the same three questions in the context of instruction caches. The methodology in those pieces of work is completely different and may be considered orthogonal to the work described here.


Tasks with Non-Preemptive Regions

The question of whether non-preemptive systems are better than preemptive systems or vice versa has been debated for a long time. Both have advantages and disadvantages. However, making a system preemptive or non-preemptive is not always a matter of choice. Some tasks have certain regions in them where they should not be preempted in order to maintain program correctness. Traditionally, such tasks have been scheduled completely non-preemptively to simplify timing analysis. Unfortunately, this suffers from the downside of decreased schedulability. Hence, I considered it worthwhile to develop a methodology to include tasks that have non-preemptive regions (NPRs) within them, but are otherwise preemptible.

Every region of a task has a range of possible execution times as calculated using best-case and worst-case timing analysis. Hence, there arise situations where, at a point in time when a task is released, a lower-priority task may have entered its NPR, but is not guaranteed to have done so. In this situation, it is not known whether the lower-priority task continues to execute or gets preempted. This uncertainty affects the worst-case response time calculations for both tasks. By considering parallel executions of worst-case scenarios for each task, safe bounds on the response times for both tasks are calculated. To the best of my knowledge, this is the first framework that calculates data cache related preemption delay and response times for tasks with non-preemptive regions.

This work resulted in a publication to appear in the proceedings of IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2008 and is currently under review.


Current Work

Throughout my research work thus far, a direct-mapped data cache has been assumed. I am currently working on extending the data cache analyzer to support to set-associative caches. This extension is non-trivial, especially in the presence of the data-dependent conditionals allowed. In the case of direct-mapped caches, the worst case scenario was to assume that the condition is always satisfied. However, in the case of set-associative caches, the replacement policy plays a role in determining whether it is worse to assume that the condition is satisfied or to assume that it is not. Access chains are used to calculate whether an element that has been brought into the cache by a prior reference is still guaranteed to be in cache or not. I am working on further generalizing this analysis to efficiently use access chains to estimate misses irrespective of the actual replacement policy being used.


Future Work

  • Short-term goals
    Shortcomings of current single-task analysis:
    Data caches have some inherent difficulties that are a challenge to overcome. One such difficulty is calculating the WCET of a task in the presence of generic (if-then-else), data-dependent conditionals. The cache hit/miss status of an access in a particular iteration depends on accesses in previous iterations. Assuming the worst order of accesses for every iteration in isolation need not necessarily result in overall worst-case behavior. Hence, such an assumption could jeopardize safety of the analysis. By imposing certain constraints that are reasonable in practice, analysis methods may be developed to characterize data cache behavior for programs with generic data-dependent conditionals and I wish to pursue such methods in the near future.
  • Medium/long-term goals
    Cyber physical systems:
    Cyber physical systems are integrations of computations with physical processes that are finding an increasing number of applications in today’s world. The usage of such systems in a real-time environment finds applications in assisted living, smart clothes, avionics, automotive networks, disaster response, etc. to mention a few. Dependability, efficiency and performance in computation, communication and control are primary requirements of such systems. Special workshops have been conducted by funding agencies such as the National Science Foundation (NSF) recently to discuss research challenges and ideas in this field.
    Due to the inherent unpredictability introduced by data caches in timing analysis, several cyber-physical systems are tending towards usage of scratchpad memories as a substitute. Data caches have the advantaged of being managed generically by hardware while scratchpad memories need to be explicitly managed through software for each task. On the other hand, scratchpad memories have the advantage of increased predictability.
    I wish to study the pros and cons of using data caches over using scratchpad memories and potentially develop methods of analysis to make their combined usage viable in cyber physical systems. In order to make data cache analysis feasible for such systems, every component (computation, communication and control) of a task may be modeled as a separate sub-task with precedence constraints introduced to maintain correctness.

    Multi-core systems:
    Starting from personal computers to the most advanced computing machines, we are progressing towards multi-core architectures. Several embedded multi-core architectures are already in existence. Architects of such systems have several options for cache organization, each of which has its own advantages in terms of system performance, cost and simplicity. Adding real-time requirements to embedded multi-core systems brings to light a whole new set of challenges in program analysis. Cache characterization for data-sharing tasks now requires consideration of issues such as cache coherence and false sharing. I wish to pursue research in this area and develop program analysis solutions for embedded multi-core architectures.

    While these are some high-level ideas in areas closely related to my current research, I wish to broaden my perspectives and explore research possibilities in other areas of computer systems in collaboration with researchers in those areas. Being in a field such as embedded systems that finds a wide variety of applications, there are several possibilities for me to branch out into areas intersecting with fields such as computer architecture and network security.


References

  1. GHOSH, S., MARTONOSI, M., AND MALIK, S. 1999. Cache miss equations: a compiler framework for analyzing and tuning memory behavior. ACM Transactions on Programming Languages and Systems 21, 4,703.746.
  2. LEE, C.-G., HAHN, J., SEO, Y.-M., MIN, S. L., HA, R., HONG, S., PARK, C. Y., LEE, M., AND KIM, C. S. 1998. Analysis or cache-related preemption delay in fixed-priority preemptive scheduling. IEEE Transactions on Computers 47(6), 700.713.
  3. LEE, C.-G., LEE, K., HAHN, J., SEO, Y.-M., MIN, S. L., HA, R., HONG, S., PARK, C. Y., LEE, M., AND KIM, C. S. 2001. Bounding cache-related preemption delay for real-time systems. IEEE Transactions on Software Engineering 27(9), 805.826.
  4. STASCHULAT, J. AND ERNST, R. 2004. Multiple process execution in cache related preemption delay analysis. In ACM International Conference on Embedded Software.
  5. STASCHULAT, J. AND ERNST, R. 2006. Worst case timing analysis of input dependent data cache behavior. In Euromicro Conference on Real-Time Systems.
  6. VERA, X., LLOSA, J., GONZ ŽALEZ, A., AND BERMUDO, N. 2000. A fast and accurate approach to analyze cache memory behavior (research note). Lecture Notes in Computer Science 1900, 194.198.
  7. VERA, X. AND XUE, J. 2002. Let's study whole-program cache behavior analytically. In International Symposium on High Performance Computer Architecture. IEEE.
  8. WEGENER, J. AND MUELLER, F. 2001. A comparison of static analysis and evolutionary testing for the verification of timing constraints. Real-Time Systems 21, 3 (Nov.), 241.268.