| 2009 |
| 201 | EE | Guangming Tan,
Ninghui Sun,
Guang R. Gao:
Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures.
IEEE Trans. Parallel Distrib. Syst. 20(2): 261-274 (2009) |
| 2008 |
| 200 | | Barbara M. Chapman,
Weimin Zheng,
Guang R. Gao,
Mitsuhisa Sato,
Eduard Ayguadé,
Dongsheng Wang:
A Practical Programming Model for the Multi-Core Era, 3rd International Workshop on OpenMP, IWOMP 2007, Beijing, China, June 3-7, 2007, Proceedings
Springer 2008 |
| 199 | EE | Yuan Zhang,
Vugranam C. Sreedhar,
Weirong Zhu,
Vivek Sarkar,
Guang R. Gao:
Minimum Lock Assignment: A Method for Exploiting Concurrency among Critical Sections.
LCPC 2008: 141-155 |
| 198 | EE | Guangming Tan,
Vugranam C. Sreedhar,
Guang R. Gao:
Just-In-Time Locality and Percolation for Optimizing Irregular Applications on a Manycore Architecture.
LCPC 2008: 331-342 |
| 197 | EE | Guangming Tan,
Dongrui Fan,
Junchao Zhang,
Andrew Russo,
Guang R. Gao:
Experience on optimizing irregular computation for memory hierarchy in manycore architecture.
PPOPP 2008: 279-280 |
| 196 | EE | Hongbo Rong,
Alban Douillet,
Guang R. Gao:
Register allocation for software pipelined multidimensional loops.
ACM Trans. Program. Lang. Syst. 30(4): (2008) |
| 195 | EE | Guang R. Gao,
Mitsuhisa Sato,
Eduard Ayguadé:
Guest Editors Introduction: Special Issue on OpenMP.
International Journal of Parallel Programming 36(3): 287-288 (2008) |
| 2007 |
| 194 | EE | Peiheng Zhang,
Guangming Tan,
Guang R. Gao:
Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform.
HPRCTA 2007: 39-48 |
| 193 | EE | Guang R. Gao,
Thomas L. Sterling,
Rick Stevens,
Mark Hereld,
Weirong Zhu:
ParalleX: A Study of A New Parallel Computation Model.
IPDPS 2007: 1-6 |
| 192 | EE | Haiping Wu,
Eunjung Park,
Mihailo Kaplarevic,
Yingping Zhang,
Murat Bolat,
Xiaoming Li,
Guang R. Gao:
Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement.
IPDPS 2007: 1-8 |
| 191 | EE | Daniel Orozco,
Liping Xue,
Murat Bolat,
Xiaoming Li,
Guang R. Gao:
Experience of Optimizing FFT on Intel Architectures.
IPDPS 2007: 1-8 |
| 190 | EE | Ge Gan,
Ziang Hu,
Juan del Cuvillo,
Guang R. Gao:
Exploring a Multithreaded Methodology to Implement a Network Communication Protocol on the Cyclops-64 Multithreaded Architecture.
IPDPS 2007: 1-8 |
| 189 | EE | Weirong Zhu,
Ziang Hu,
Guang R. Gao:
On the Role of Deterministic Fine-Grain Data Synchronization for Scientific Applications: A Revisit in the Emerging Many-Core Era.
IPDPS 2007: 1-8 |
| 188 | EE | Weirong Zhu,
Ziang Hu,
Guang R. Gao:
On the Role of Deterministic Fine-Grain Data Synchronization for Scientific Applications: A Revisit in the Emerging Many-Core Era.
IPDPS 2007: 1-8 |
| 187 | EE | Long Chen,
Ziang Hu,
Junmin Lin,
Guang R. Gao:
Optimizing the Fast Fourier Transform on a Multi-core Architecture.
IPDPS 2007: 1-8 |
| 186 | EE | Weirong Zhu,
Vugranam C. Sreedhar,
Ziang Hu,
Guang R. Gao:
Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures.
ISCA 2007: 35-45 |
| 185 | EE | Yuan Zhang,
Evelyn Duesterwald,
Guang R. Gao:
Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers.
LCPC 2007: 95-109 |
| 184 | EE | Guang R. Gao:
On Parallel Models of Computation.
NPC 2007: 541 |
| 183 | EE | Alban Douillet,
Guang R. Gao:
Software-Pipelining on Multi-Core Architectures.
PACT 2007: 39-48 |
| 182 | EE | Yuan Zhang,
Vugranam C. Sreedhar,
Weirong Zhu,
Vivek Sarkar,
Guang R. Gao:
Optimized lock assignment and allocation: a method for exploiting concurrency among critical sections.
PPOPP 2007: 146-147 |
| 181 | EE | Guangming Tan,
Ninghui Sun,
Guang R. Gao:
A parallel dynamic programming algorithm on a multi-core architecture.
SPAA 2007: 135-144 |
| 180 | EE | Weirong Zhu,
Yanwei Niu,
Guang R. Gao:
Performance portability on EARTH: a case study across several parallel architectures.
Cluster Computing 10(2): 115-126 (2007) |
| 179 | EE | Hongbo Rong,
Zhizhong Tang,
Ramaswamy Govindarajan,
Alban Douillet,
Guang R. Gao:
Single-dimension software pipelining for multidimensional loops.
TACO 4(1): (2007) |
| 2006 |
| 178 | EE | Guang R. Gao:
The Era of Multi-core Chips - A Fresh Look on Software Challenges.
Asia-Pacific Computer Systems Architecture Conference 2006: 1 |
| 177 | EE | Juan del Cuvillo,
Weirong Zhu,
Guang R. Gao:
Landing openMP on cyclops-64: an efficient mapping of openMP to a many-core system-on-a-chip.
Conf. Computing Frontiers 2006: 41-50 |
| 176 | EE | Ziang Hu,
Juan del Cuvillo,
Weirong Zhu,
Guang R. Gao:
Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences.
Euro-Par 2006: 134-144 |
| 175 | EE | Alban Douillet,
Hongbo Rong,
Guang R. Gao:
Multi-dimensional Kernel Generation for Loop Nest Software Pipelining.
Euro-Par 2006: 311-322 |
| 174 | EE | Juan del Cuvillo,
Weirong Zhu,
Ziang Hu,
Guang R. Gao:
Toward a Software Infrastructure for the Cyclops-64 Cellular Architecture.
HPCS 2006: 9 |
| 173 | EE | Yingping Zhang,
Taikyeong Jeong,
Fei Chen,
Haiping Wu,
R. Nitzsche,
Guang R. Gao:
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture.
IPDPS 2006 |
| 172 | EE | Guang R. Gao,
Thomas L. Sterling,
Rick L. Stevens,
Mark Hereld,
Weirong Zhu:
Hierarchical multithreading: programming model and system software.
IPDPS 2006 |
| 171 | EE | Weirong Zhu,
Parimala Thulasiraman,
Ruppa K. Thulasiram,
Guang R. Gao:
Exploring Financial Applications on Many-Core-on-a-Chip Architecture: A First Experiment.
ISPA Workshops 2006: 221-230 |
| 170 | | Haiping Wu,
Eunjung Park,
Long Chen,
Juan del Cuvillo,
Guang R. Gao:
User-Friendly Methodology for Automatic Exploration of Compiler Options: A Case Study on the Intel XScale Microarchitecture.
Software Engineering Research and Practice 2006: 866-872 |
| 169 | | Haiping Wu,
Long Chen,
Joseph Manzano,
Guang R. Gao:
A User-Friendly Methodology for Automatic Exploration of Compiler Options.
Software Engineering Research and Practice 2006: 873-882 |
| 2005 |
| 168 | | Haiping Wu,
Ziang Hu,
Joseph Manzano,
Yingping Zhang,
Guang R. Gao:
Identifying Multiply-Add Operations in Kylin Compiler.
ESA 2005: 81-87 |
| 167 | EE | Weirong Zhu,
Yanwei Niu,
Guang R. Gao:
Performance Portability on EARTH: A Case Study across Several Parallel Architectures.
IPDPS 2005 |
| 166 | EE | Guang R. Gao:
Sustained Petaflop and Beyond: Can Parallel Computing Systems Meet The Challenges?
IPDPS 2005 |
| 165 | EE | Juan del Cuvillo,
Weirong Zhu,
Ziang Hu,
Guang R. Gao:
TiNy Threads: A Thread Virtual Machine for the Cyclops64 Cellular Architecture.
IPDPS 2005 |
| 164 | EE | Dongrui Fan,
Zhimin Tang,
Hailin Huang,
Guang R. Gao:
An energy efficient TLB design methodology.
ISLPED 2005: 351-356 |
| 163 | EE | Alban Douillet,
Guang R. Gao:
Register Pressure in Software-Pipelined Loop Nests: Fast Computation and Impact on Architecture Design.
LCPC 2005: 17-31 |
| 162 | EE | Yanwei Niu,
Ziang Hu,
Kenneth E. Barner,
Guang R. Gao:
Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64.
NPC 2005: 132-143 |
| 161 | EE | Hongbo Rong,
Alban Douillet,
Guang R. Gao:
Register allocation for software pipelined multi-dimensional loops.
PLDI 2005: 154-167 |
| 160 | | Yuan Zhang,
Weirong Zhu,
Fei Chen,
Ziang Hu,
Guang R. Gao:
Sequential Consistency Revisit: The Sufficient Condition and Method to Reason the Consistency Model of a Multiprocessor-on-a-Chip Architecture.
Parallel and Distributed Computing and Networks 2005: 13-19 |
| 159 | EE | Robel Y. Kahsay,
Guoli Wang,
Guang R. Gao,
Li Liao,
Roland L. Dunbrack Jr.:
Quasi-consensus-based comparison of profile hidden Markov models for protein sequences.
Bioinformatics 21(10): 2287-2293 (2005) |
| 158 | EE | Robel Y. Kahsay,
Guang R. Gao,
Li Liao:
An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes.
Bioinformatics 21(9): 1853-1858 (2005) |
| 157 | EE | Haiping Wu,
Ziang Hu,
Joseph Manzano,
Guang R. Gao:
Madd Operation Aware Redundancy Elimination.
International Journal of Software Engineering and Knowledge Engineering 15(2): 357-362 (2005) |
| 156 | EE | Hongbo Yang,
Ramaswamy Govindarajan,
Guang R. Gao,
Ziang Hu:
Improving power efficiency with compiler-assisted cache replacement.
J. Embedded Computing 1(4): 487-499 (2005) |
| 2004 |
| 155 | | Bernd Kleinjohann,
Guang R. Gao,
Hermann Kopetz,
Lisa Kleinjohann,
Achim Rettberg:
Design Methods and Applications for Distributed Embedded Systems, IFIP 18th World Computer Congress, TC10 Working Conference on Distributed and Parallel Embedded Systems (DIPES 2004), 22-27 August 2004, Toulouse, France
Kluwer 2004 |
| 154 | | Laurence Tianruo Yang,
Minyi Guo,
Guang R. Gao,
Niraj K. Jha:
Embedded and Ubiquitous Computing, International Conference EUC 2004, Aizu-Wakamatsu City, Japan, August 25-27, 2004, Proceedings
Springer 2004 |
| 153 | | Hai Jin,
Guang R. Gao,
Zhiwei Xu,
Hao Chen:
Network and Parallel Computing, IFIP International Conference, NPC 2004, Wuhan, China, October 18-20, 2004, Proceedings
Springer 2004 |
| 152 | EE | Hongbo Rong,
Zhizhong Tang,
Ramaswamy Govindarajan,
Alban Douillet,
Guang R. Gao:
Single-Dimension Software Pipelining for Multi-Dimensional Loops.
CGO 2004: 163-174 |
| 151 | EE | Hongbo Rong,
Alban Douillet,
Ramaswamy Govindarajan,
Guang R. Gao:
Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops.
CGO 2004: 175-188 |
| 150 | EE | Fei Chen,
Kevin B. Theobald,
Guang R. Gao:
Implementing parallel conjugate gradient on the EARTH multithreaded architecture.
CLUSTER 2004: 459-469 |
| 149 | EE | Arthur Stoutchinin,
Guang R. Gao:
If-Conversion in SSA Form.
Euro-Par 2004: 336-345 |
| 148 | EE | Robel Y. Kahsay,
Li Liao,
Guang R. Gao:
An Improved Hidden Markov Model for Transmembrane Topology Prediction.
ICTAI 2004: 634-639 |
| 147 | EE | Weirong Zhu,
Yanwei Niu,
Jizhu Lu,
Chuan Shen,
Guang R. Gao:
A cluster-based solution for high performance hmmpfam using EARTH execution model.
IJHPCN 2(2/3/4): 66-76 (2004) |
| 146 | EE | Parimala Thulasiraman,
Ashfaq A. Khokhar,
Gerd Heber,
Guang R. Gao:
A fine-grain load-adaptive algorithm of the 2D discrete wavelet transform for multithreaded architectures.
J. Parallel Distrib. Comput. 64(1): 68-78 (2004) |
| 2003 |
| 145 | EE | Weirong Zhu,
Yanwei Niu,
Jizhu Lu,
Chuan Shen,
Guang R. Gao:
A Cluster-Based Solution for High Performance Hmmpfam Using EARTH Execution Model.
CLUSTER 2003: 30-37 |
| 144 | EE | Weirong Zhu,
Yanwei Niu,
Jizhu Lu,
Guang R. Gao:
Implementing Parallel Hmm-pfam on the EARTH Multithreaded Architecture.
CSB 2003: 549-550 |
| 143 | EE | Liu Yang,
Sun Chan,
Guang R. Gao,
Roy Ju,
Guei-Yuan Lueh,
Zhaoqing Zhang:
Inter-procedural stacked register allocation for itanium® like architecture.
ICS 2003: 215-225 |
| 142 | EE | Guang R. Gao,
Kevin B. Theobald,
Ramaswamy Govindarajan,
Clement Leung,
Ziang Hu,
Haiping Wu,
Jizhu Lu,
Juan del Cuvillo,
Adeline Jacquet,
Vincent Janot,
Thomas L. Sterling:
Programming Models and System Software for Future High-End Computing Systems: Work-in-Progress.
IPDPS 2003: 206 |
| 141 | EE | Adeline Jacquet,
Vincent Janot,
Clement Leung,
Guang R. Gao,
Ramaswamy Govindarajan,
Thomas L. Sterling:
An Executable Analytical Performance Evaluation Approach for Early Performance Prediction.
IPDPS 2003: 268 |
| 140 | EE | Andrés Márquez,
Guang R. Gao:
CARE: Overview of an Adaptive Multithreaded Architecture.
ISHPC 2003: 26-38 |
| 139 | EE | Juan del Cuvillo,
Xinmin Tian,
Guang R. Gao,
Milind Girkar:
Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor.
ISHPC 2003: 450-457 |
| 138 | EE | Hongbo Yang,
Ramaswamy Govindarajan,
Guang R. Gao,
Ziang Hu:
Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation.
LCPC 2003: 77-92 |
| 137 | EE | Guang R. Gao,
Trevor N. Mudge:
Special issue on compilers, architecture, and synthesis for embedded systems.
ACM Trans. Embedded Comput. Syst. 2(2): 131 (2003) |
| 136 | EE | Guy Tremblay,
C. J. Morrone,
José Nelson Amaral,
Guang R. Gao:
Implementation of the EARTH programming model on SMP clusters: a multi-threaded language and runtime system.
Concurrency and Computation: Practice and Experience 15(9): 821-844 (2003) |
| 135 | EE | Ramaswamy Govindarajan,
Hongbo Yang,
José Nelson Amaral,
Chihong Zhang,
Guang R. Gao:
Minimum Register Instruction Sequencing to Reduce Register Spills in Out-of-Order Issue Superscalar Architectures.
IEEE Trans. Computers 52(1): 4-20 (2003) |
| 2002 |
| 134 | EE | Hongbo Yang,
Guang R. Gao,
Clement Leung:
On achieving balanced power consumption in software pipelined loops.
CASES 2002: 210-217 |
| 133 | EE | Yujing Zeng,
Jianshan Tang,
Javier Garcia-Frias,
Guang R. Gao:
An Adaptive Meta-Clustering Approach: Combining the Information from Different Clustering Results.
CSB 2002: 276- |
| 132 | EE | Rishi Khan,
Yujing Zeng,
Javier Garcia-Frias,
Guang R. Gao:
A Bayesian Modeling Framework for Genetic Regulation.
CSB 2002: 330-332 |
| 131 | EE | Hongbo Yang,
Ramaswamy Govindarajan,
Guang R. Gao,
Kevin B. Theobald:
Power-Performance Trade-Offs for Energy-Efficient Architectures: A Quantitative Study.
ICCD 2002: 174-179 |
| 130 | EE | Praveen R. Thiagarajan,
Guang R. Gao:
Visualizing Biosequence Data Using Texture Mapping.
INFOVIS 2002: 103-109 |
| 129 | EE | Rishi Kumar,
Gagan Agrawal,
Guang R. Gao:
Compiling Several Classes of Communication Patterns on a Multithreaded Architecture.
IPDPS 2002 |
| 128 | EE | Guang R. Gao,
Kevin B. Theobald,
Ziang Hu,
Haiping Wu,
Jizhu Lu,
Keshav Pingali,
Paul Stodghill,
Thomas L. Sterling,
Rick Stevens,
Mark Hereld:
Next Generation System Software for Future High-End Computing Systems.
IPDPS 2002 |
| 127 | EE | Alban Douillet,
José Nelson Amaral,
Guang R. Gao:
Fine-Grain Stacked Register Allocation for the Itanium Architecture.
LCPC 2002: 344-361 |
| 126 | | Robel Y. Kahsay,
Guoli Wang,
Nataraj Dongre,
Guang R. Gao,
Roland L. Dunbrack Jr.:
CASA: a server for the critical assessment of protein sequence alignment accuracy.
Bioinformatics 18(3): 496-497 (2002) |
| 125 | | Adalberto T. Castelo,
Wellington Martins,
Guang R. Gao:
TROLL-Tandem Repeat Occurrence Locator.
Bioinformatics 18(4): 634-636 (2002) |
| 124 | | Kevin B. Theobald,
Rishi Kumar,
Gagan Agrawal,
Gerd Heber,
Ruppa K. Thulasiram,
Guang R. Gao:
Implementation and evaluation of a communication intensive application on the EARTH multithreaded system.
Concurrency and Computation: Practice and Experience 14(3): 183-201 (2002) |
| 123 | EE | Ramaswamy Govindarajan,
Guang R. Gao,
Palash Desai:
Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks.
VLSI Signal Processing 31(3): 207-229 (2002) |
| 2001 |
| 122 | EE | Artour Stoutchinin,
José Nelson Amaral,
Guang R. Gao,
James C. Dehnert,
Suneel Jain,
Alban Douillet:
Speculative Prefetching of Induction Pointers.
CC 2001: 289-303 |
| 121 | EE | Eduard Ayguadé,
Fredrik Dahlgren,
Christine Eisenbeis,
Roger Espasa,
Guang R. Gao,
Henk L. Muller,
Rizos Sakellariou,
André Seznec:
Topic 08+13: Instruction-Level Parallelism and Computer Architecture.
Euro-Par 2001: 385 |
| 120 | | Ruppa K. Thulasiram,
Lubomir Litov,
Hassan Nojumi,
Christopher T. Downing,
Guang R. Gao:
Multithreaded Algorithms for Pricing a Class of Complex Options.
IPDPS 2001: 18 |
| 119 | | Ramaswamy Govindarajan,
Hongbo Yang,
Chihong Zhang,
José Nelson Amaral,
Guang R. Gao:
Minimum Register Instruction Sequence Problem: Revisiting Optimal Code Generation for DAGs.
IPDPS 2001: 26 |
| 118 | | Guang R. Gao:
Bridging the gap between ISA compilers and silicon compilers: a challenge for future SoC design.
ISSS 2001: 93 |
| 117 | | Wolfgang Rosenstiel,
Brian Bailey,
Masahiro Fujita,
Guang R. Gao,
Rajesh K. Gupta,
Preeti Ranjan Panda:
New design paradigms.
ISSS 2001: 94 |
| 116 | EE | W. S. Martins,
Juan del Cuvillo,
F. J. Useche,
Kevin B. Theobald,
Guang R. Gao:
A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison.
Pacific Symposium on Biocomputing 2001: 311-322 |
| 115 | | José Nelson Amaral,
Wen-Yen Lin,
Jean-Luc Gaudiot,
Guang R. Gao:
Exploiting Locality in Single Assignment Data Structures Updated Through Split-Phase Transactions.
Cluster Computing 4(4): 281-293 (2001) |
| 114 | | Prasad Kakulavarapu,
Olivier Maquelin,
José Nelson Amaral,
Guang R. Gao:
Dynamic Load Balancers for a Multithreaded Multiprocessor System.
Parallel Processing Letters 11(1): 169-184 (2001) |
| 2000 |
| 113 | EE | Ramaswamy Govindarajan,
Erik R. Altman,
Guang R. Gao:
A Theory for Software-Hardware Co-Scheduling for ASIPs and Embedded Processors.
ASAP 2000: 329-338 |
| 112 | EE | Kevin B. Theobald,
Rishi Kumar,
Gagan Agrawal,
Gerd Heber,
Ruppa K. Thulasiram,
Guang R. Gao:
Developing a Communication Intensive Application on the EARTH Multithreaded Architecture (Distinguished Paper).
Euro-Par 2000: 625-637 |
| 111 | EE | Gary M. Zoppetti,
Gagan Agrawal,
Lori L. Pollock,
José Nelson Amaral,
Xinan Tang,
Guang R. Gao:
Automatic compiler techniques for thread coarsening for multithreaded architectures.
ICS 2000: 306-315 |
| 110 | EE | Wen-Yen Lin,
Jean-Luc Gaudiot,
José Nelson Amaral,
Guang R. Gao:
Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System.
IPDPS 2000: 589-594 |
| 109 | EE | Bruce Carter,
Chuin-Shan Chen,
L. Paul Chew,
Nikos Chrisochoides,
Guang R. Gao,
Gerd Heber,
Anthony R. Ingraffea,
Roland Krause,
Chris Myers,
Démian Nave,
Keshav Pingali,
Paul Stodghill,
Stephen A. Vavasis,
Paul A. Wawrzynek:
Parallel FEM Simulation of Crack Propagation - Challenges, Status, and Perspectives.
IPDPS Workshops 2000: 443-449 |
| 108 | EE | José Nelson Amaral,
Guang R. Gao,
Erturk Dogan Kocalar,
Patrick O'Neill,
Xinan Tang:
Design and Implementation of an Efficient Thread Partitioning Algorithm.
ISHPC 2000: 252-259 |
| 107 | | Ruppa K. Thulasiram,
Christopher T. Downing,
Guang R. Gao:
Recursive and Iterative Multithreaded Algorithms for Pricing American Securities.
PDPTA 2000 |
| 106 | EE | Kevin B. Theobald,
Gagan Agrawal,
Rishi Kumar,
Gerd Heber,
Guang R. Gao,
Paul Stodghill,
Keshav Pingali:
Landing CG on EARTH: A Case Study of Fine-Grained Multithreading on an Evolutionary Path.
SC 2000 |
| 105 | EE | Parimala Thulasiraman,
Kevin B. Theobald,
Ashfaq A. Khokhar,
Guang R. Gao:
Multithreaded algorithms for the fast Fourier transform.
SPAA 2000: 176-185 |
| 104 | | Gerd Heber,
Rupak Biswas,
Guang R. Gao:
Self-Avoiding Walks over Adaptive Unstructured Grids.
Concurrency - Practice and Experience 12(2-3): 85-109 (2000) |
| 103 | EE | Guang R. Gao,
Vivek Sarkar:
Location Consistency-A New Memory Model and Cache Consistency Protocol.
IEEE Trans. Computers 49(8): 798-813 (2000) |
| 102 | | Ramaswamy Govindarajan,
N. S. S. Narasimha Rao,
Erik R. Altman,
Guang R. Gao:
Enhanced Co-Scheduling: A Software Pipelining Method Using Modulo-Scheduled Pipeline Theory.
International Journal of Parallel Programming 28(1): 1-46 (2000) |
| 1999 |
| 101 | | Chihong Zhang,
Ramaswamy Govindarajan,
Sean Ryan,
Guang R. Gao:
Efficient State-Diagram Construction Methods for Software Pipelining.
CC 1999: 153-167 |
| 100 | EE | Dean M. Tullsen,
Guang R. Gao:
Multithreaded Execution Architecture and Compilation.
HPCA 1999: 321 |
| 99 | EE | Gerd Heber,
Guang R. Gao,
Rupak Biswas:
A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes.
IPPS/SPDP 1999: 360-364 |
| 98 | EE | Ashfaq A. Khokhar,
Gerd Heber,
Parimala Thulasiraman,
Guang R. Gao:
Load Adaptive Algorithms and Implementations for the 2D Discrete Wavelet Transform on Fine-Grain Multithreaded Architectures.
IPPS/SPDP 1999: 458-462 |
| 97 | | Guang R. Gao:
From EARTH to HTMT: An Evolution of a Multithreaded Architecture Model (Abstract).
IPPS/SPDP Workshops 1999: 1025 |
| 96 | | Shigeru Kusakabe,
Kentaro Inenaga,
Makoto Amamiya,
Xinan Tang,
Andrés Márquez,
Guang R. Gao:
Implementing a Non-Strict Functional Programming Language on a Threaded Architecture.
IPPS/SPDP Workshops 1999: 138-152 |
| 95 | | Gerd Heber,
Rupak Biswas,
Guang R. Gao:
Self-Avoiding Walks over Adaptive Unstructured Grids.
IPPS/SPDP Workshops 1999: 968-977 |
| 94 | | Sean Ryan,
José Nelson Amaral,
Guang R. Gao,
Zachary Ruiz,
Andrés Márquez,
Kevin B. Theobald:
Coping with very High Latencies in Petaflop Computer Systems.
ISHPC 1999: 71-82 |
| 93 | EE | Ramaswamy Govindarajan,
Chihong Zhang,
Guang R. Gao:
Minimum Register Instruction Scheduling: A New Approach for Dynamic Instruction Issue Processors.
LCPC 1999: 70-84 |
| 92 | | Gerd Heber,
Rupak Biswas,
Guang R. Gao:
Self-Avoiding Walks Over Adaptive Triangular Grids.
PPSC 1999 |
| 91 | | Xinan Tang,
Guang R. Gao:
Automatically Partitioning Threads for Multithreaded Architectures.
J. Parallel Distrib. Comput. 58(2): 159-189 (1999) |
| 90 | | Walid A. Najjar,
Edward A. Lee,
Guang R. Gao:
Advances in the dataflow computational model.
Parallel Computing 25(13-14): 1907-1929 (1999) |
| 1998 |
| 89 | | Sylvain Lelait,
Guang R. Gao,
Christine Eisenbeis:
A New Fast Algorithm for Optimal Register Allocation in Modulo Scheduled Loops.
CC 1998: 204-218 |
| 88 | EE | Darren Erik Vengroff,
Guang R. Gao:
Partial Sampling with Reverse State Reconstruction: A New Technique for Branch Predictor Performance Estimation.
HPCA 1998: 342-351 |
| 87 | EE | Xinan Tang,
Guang R. Gao:
Automatically Partitioning Threads Based on Remote Paths.
ICPADS 1998: 632-639 |
| 86 | EE | Ramaswamy Govindarajan,
N. S. S. Narasimha Rao,
Erik R. Altman,
Guang R. Gao:
An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams.
IPPS/SPDP 1998: 168-175 |
| 85 | | Gerd Heber,
Rupak Biswas,
Parimala Thulasiraman,
Guang R. Gao:
Using Multithreading for the Automatic Load Balancing of Adaptive Finite Element Meshes.
IRREGULAR 1998: 132-143 |
| 84 | EE | Xinan Tang,
Guang R. Gao:
How "Hard" is Thread Partitioning and How "Bad" is a List Scheduling Based Partitioning Algorithm?
SPAA 1998: 130-139 |
| 83 | EE | Vugranam C. Sreedhar,
Guang R. Gao,
Yong-Fong Lee:
A New Framework for Elimination-Based Data Flow Analysis Using DJ Graphs.
ACM Trans. Program. Lang. Syst. 20(2): 388-435 (1998) |
| 82 | | Erik R. Altman,
Guang R. Gao:
Optimal Modulo Scheduling Through Enumeration.
International Journal of Parallel Programming 26(2): 313-344 (1998) |
| 81 | | Erik R. Altman,
Ramaswamy Govindarajan,
Guang R. Gao:
A Unified Framework for Instruction Scheduling and Mapping for Function Units with Structural Hazards.
J. Parallel Distrib. Comput. 49(2): 259-293 (1998) |
| 1997 |
| 80 | | Maria-Dana Tarlescu,
Kevin B. Theobald,
Guang R. Gao:
Elastic History Buffer: A Low-Cost Method to Improve Branch Prediction Accuracy.
ICCD 1997: 82-87 |
| 79 | EE | Xinan Tang,
Rakesh Ghiya,
Laurie J. Hendren,
Guang R. Gao:
Heap Analysis and Optimizations for Threaded Programs.
IEEE PACT 1997: 14-25 |
| 78 | EE | Rad Silvera,
Jian Wang,
Ramaswamy Govindarajan,
Guang R. Gao:
A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors.
IEEE PACT 1997: 78-89 |
| 77 | EE | Shashank S. Nemawarkar,
Guang R. Gao:
Latency Tolerance: A Metric for Performance Analysis of Multithreaded Architectures.
IPPS 1997: 227-232 |
| 76 | | Guang R. Gao,
Vivek Sarkar:
On the Importance of an End-To-End View of Memory Consistency in Future Computer Systems.
ISHPC 1997: 30-41 |
| 75 | | Angela Sodan,
Guang R. Gao,
Olivier Maquelin,
Jens-Uwe Schultz,
Xinmin Tian:
Experiences with Non-numeric Applications on Multithreaded Architectures.
PPOPP 1997: 124-135 |
| 74 | EE | Xinan Tang,
Jing Wang,
Kevin B. Theobald,
Guang R. Gao:
Thread Partitioning and Scheduling Based on Cost Model.
SPAA 1997: 272-281 |
| 73 | EE | Vugranam C. Sreedhar,
Guang R. Gao,
Yong-Fong Lee:
Incremental Computation of Dominator Trees.
ACM Trans. Program. Lang. Syst. 19(2): 239-252 (1997) |
| 1996 |
| 72 | EE | Xinmin Tian,
Shashank S. Nemawarkar,
Guang R. Gao,
Herbert H. J. Hum:
Data locality sensitivity of multithreaded computations on a distributed-memory multiprocessor.
CASCON 1996: 37 |
| 71 | | Jian Wang,
Guang R. Gao:
Pipelining-Dovetailing: A Transformation to Enhance Software Pipelining for Nested Loops.
CC 1996: 1-17 |
| 70 | | Erik R. Altman,
Guang R. Gao:
Optimal Software Pipelining Through Enumeration of Schedules.
Euro-Par, Vol. II 1996: 833-840 |
| 69 | EE | Ramaswamy Govindarajan,
Erik R. Altman,
Guang R. Gao:
Co-Scheduling Hardware and Software Pipelines.
HPCA 1996: 52-61 |
| 68 | EE | Olivier Maquelin,
Guang R. Gao,
Herbert H. J. Hum,
Kevin B. Theobald,
Xinmin Tian:
Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling.
ISCA 1996: 179-188 |
| 67 | | Vivek Sarkar,
Guang R. Gao,
Shaohua Han:
Locality Analysis for Distributed Shared-Memory Multiprocessors.
LCPC 1996: 20-40 |
| 66 | | Shashank S. Nemawarkar,
Guang R. Gao:
Measurement and Modeling of EARTH-MANNA Multithreaded Architecture.
MASCOTS 1996: 109-114 |
| 65 | | John C. Ruttenberg,
Guang R. Gao,
Woody Lichtenstein,
Artour Stoutchinin:
Software Pipelining Showdown: Optimal vs. Heuristic Methods in a Production Compiler.
PLDI 1996: 1-11 |
| 64 | | Vugranam C. Sreedhar,
Guang R. Gao,
Yong-Fong Lee:
A New Framework for Exhaustive and Incremental Data Flow Analysis Using DJ Graphs.
PLDI 1996: 278-290 |
| 63 | EE | Vugranam C. Sreedhar,
Guang R. Gao,
Yong-Fong Lee:
Identifying Loops Using DJ Graphs.
ACM Trans. Program. Lang. Syst. 18(6): 649-658 (1996) |
| 62 | EE | Ramaswamy Govindarajan,
Erik R. Altman,
Guang R. Gao:
A Framework for Resource-Constrained Rate-Optimal Software Pipelining.
IEEE Trans. Parallel Distrib. Syst. 7(11): 1133-1149 (1996) |
| 1995 |
| 61 | | Olivier Maquelin,
Herbert H. J. Hum,
Guang R. Gao:
Costs and Benefits of Multithreading with Off-the-Shelf RISC Processors.
Euro-Par 1995: 117-128 |
| 60 | EE | Qi Ning,
Vincent Van Dongen,
Guang R. Gao:
Automatic data and computation decomposition for distributed memory machines.
HICSS (2) 1995: 103-112 |
| 59 | | Kevin B. Theobald,
Herbert H. J. Hum,
Guang R. Gao:
A Design Frame for Hybrid Access Caches.
HPCA 1995: 144-153 |
| 58 | | Guang R. Gao,
Vivek Sarkar:
Location Consistency: Stepping Beyond the Memory Coherence Barrier.
ICPP (2) 1995: 73-76 |
| 57 | | Vugranam C. Sreedhar,
Guang R. Gao,
Yong-Fong Lee:
Incremental Computation of Dominator Trees.
Intermediate Representations Workshop 1995: 1-12 |
| 56 | EE | Nasser Elmasri,
Herbert H. J. Hum,
Guang R. Gao:
The Threaded Communication Library: Preliminary Experiences on a Multiprocessor with Dual-Processor Nodes.
International Conference on Supercomputing 1995: 195-199 |
| 55 | | Erik R. Altman,
Guang R. Gao,
Ramaswamy Govindarajan:
An Experimental Study of an ILP-based Exact Solution Method for Software Pipelining.
LCPC 1995: 16-30 |
| 54 | EE | Luis A. Lozano,
Guang R. Gao:
Exploiting short-lived variables in superscalar processors.
MICRO 1995: 292-302 |
| 53 | | Erik R. Altman,
Ramaswamy Govindarajan,
Guang R. Gao:
Scheduling and Mapping: Software Pipelining in the Presence of Structural Hazards.
PLDI 1995: 139-150 |
| 52 | | Vugranam C. Sreedhar,
Guang R. Gao:
A Linear Time Algorithm for Placing phi-nodes.
POPL 1995: 62-73 |
| 51 | | Eshrat Arjomandi,
William G. O'Farrell,
Ivan Kalas,
Gita Koblents,
Frank Ch. Eigler,
Guang R. Gao:
ABC++: Concurrency by Inheritance in C++.
IBM Systems Journal 34(1): 120-137 (1995) |
| 50 | EE | Vugranam C. Sreedhar,
Guang R. Gao:
Computing phi-nodes in linear time using DJ graphs.
J. Prog. Lang. 3(4): (1995) |
| 49 | | Qi Ning,
Guang R. Gao:
Automatic Data and Computation Decomposition for Distributed-Memory Machines.
Parallel Processing Letters 5: 539-550 (1995) |
| 48 | EE | Ramaswamy Govindarajan,
Guang R. Gao:
Rate-optimal schedule for multi-rate DSP computations.
VLSI Signal Processing 9(3): 211-232 (1995) |
| 1994 |
| 47 | | Michel Cosnard,
Guang R. Gao,
Gabriel M. Silberman:
Parallel Architectures and Compilation Techniques, Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT'94, Montréal, Canada, 24-26 August, 1994
North-Holland 1994 |
| 46 | EE | Gilles Hurteau,
Vincent Van Dongen,
Guang R. Gao:
EPPP - an integrated environment for portable parallel programming.
CASCON 1994: 31 |
| 45 | EE | Ivan Kalas,
Eshrat Arjomandi,
Guang R. Gao,
William G. O'Farrell:
FTL: a multithreaded environment for parallel computation.
CASCON 1994: 33 |
| 44 | EE | Qi Ning,
Vincent Van Dongen,
Guang R. Gao:
Automatic decomposition in EPPP compiler.
CASCON 1994: 49 |
| 43 | EE | Vincent Van Dongen,
Christophe Bonello,
Guang R. Gao:
Data parallelism with high performance C.
CASCON 1994: 69 |
| 42 | | Ramaswamy Govindarajan,
Erik R. Altman,
Guang R. Gao:
A Framework for Resource-Constrained Rate-Optimal Software Pipelining.
CONPAR 1994: 640-651 |
| 41 | | Guoning Liao,
Erik R. Altman,
Vinod K. Agarwal,
Guang R. Gao:
A Comparative Study of Multiprocessor List Scheduling Heuristics.
HICSS (1) 1994: 68-77 |
| 40 | | Herbert H. J. Hum,
Kevin B. Theobald,
Guang R. Gao:
Building Multithreaded Architectures with Off-the-Shelf Microprocessors.
IPPS 1994: 288-294 |
| 39 | EE | Ramaswamy Govindarajan,
Erik R. Altman,
Guang R. Gao:
Minimizing register requirements under resource-constrained rate-optimal software pipelining.
MICRO 1994: 85-94 |
| 38 | | Shashank S. Nemawarkar,
Ramaswamy Govindarajan,
Guang R. Gao,
Vinod K. Agarwal:
Performance of Interconnection Network in Multithreaded Architectures.
PARLE 1994: 823-826 |
| 1993 |
| 37 | | Erik R. Altman,
Vinod K. Agarwal,
Guang R. Gao:
A Novel Methodology Using Genetic Algorithms for the Design of Caches and Cache Replacement Policy.
ICGA 1993: 392-399 |
| 36 | EE | Kevin B. Theobald,
Guang R. Gao,
Laurie J. Hendren:
Speculative Execution and Branch Prediction on Parallel Machines.
International Conference on Supercomputing 1993: 77-86 |
| 35 | | Guang R. Gao,
Qi Ning,
Vincent Van Dongen:
Extending Software Pipelining Techniques for Scheduling Nested Loops.
LCPC 1993: 340-357 |
| 34 | | Robert Kim Yates,
Guang R. Gao:
A Kahn Principle for Networks of Nonmonotonic Real-time Processes.
PARLE 1993: 209-227 |
| 33 | | Qi Ning,
Guang R. Gao:
A Novel Framework of Register Allocation for Software Pipelining.
POPL 1993: 29-42 |
| 32 | | Shashank S. Nemawarkar,
Ramaswamy Govindarajan,
Guang R. Gao,
Vinod K. Agarwal:
Analysis of Multithreaded Multiprocessors with Distributed Shared Memory.
SPDP 1993: 114-121 |
| 31 | | Laurie J. Hendren,
Guang R. Gao:
Designing Programming Languages for the Analyzability of Pointer Data Structures.
Comput. Lang. 19(2): 119-134 (1993) |
| 30 | | Guang R. Gao,
Jean-Luc Gaudiot,
Lubomir Bic:
Special Issue on DataFlow and Multithreaded Architectures - Guest Editors' Introduction.
J. Parallel Distrib. Comput. 18(3): 271-272 (1993) |
| 29 | | Guang R. Gao:
An Efficient Hybrid Dataflow Architecture Model.
J. Parallel Distrib. Comput. 19(4): 293-307 (1993) |
| 1992 |
| 28 | | Laurie J. Hendren,
Guang R. Gao,
Erik R. Altman,
Chandrika Mukerji:
A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs.
CC 1992: 176-191 |
| 27 | | Vincent Van Dongen,
Guang R. Gao,
Qi Ning:
A Polynomial Time Method for Optimal Software Pipelining.
CONPAR 1992: 613-624 |
| 26 | | Shashank S. Nemawarkar,
Ramaswamy Govindarajan,
Guang R. Gao,
Vinod K. Agarwal:
Performance Evaluation of Latency Tolerant Architectures.
ICCI 1992: 183-186 |
| 25 | EE | Laurie J. Hendren,
Guang R. Gao:
Designing programming languages for analyzability: a fresh look at pointer data structures.
ICCL 1992: 242-251 |
| 24 | | Jean-Marc Monti,
Guang R. Gao:
Efficient Interprocessor Synchronization/Communication on a Dataflow Multiprocessor Architecture.
ICPP (1) 1992: 220-223 |
| 23 | | Guang R. Gao,
R. Olsen,
Vivek Sarkar,
Radhika Thekkath:
Collective Loop Fusion for Array Contraction.
LCPC 1992: 281-295 |
| 22 | | Laurie J. Hendren,
C. Donawa,
Maryam Emami,
Guang R. Gao,
Justiani,
B. Sridharan:
Designing the McCAT Compiler Based on a Family of Structured Intermediate Representations.
LCPC 1992: 406-420 |
| 21 | EE | Kevin B. Theobald,
Guang R. Gao,
Laurie J. Hendren:
On the limits of program parallelism and its smoothability.
MICRO 1992: 10-19 |
| 20 | | Qi Ning,
Guang R. Gao:
Minimizing Loop Storage Allocation for An Argument-Fetching Dataflow Architecture Model.
PARLE 1992: 585-600 |
| 1991 |
| 19 | EE | Guang R. Gao,
Yue-Bong Wong,
Qi Ning:
A timed Petri-net model for fine-grain loop scheduling.
CASCON 1991: 395-415 |
| 18 | EE | Vivek Sarkar,
Guang R. Gao:
Optimization of array accesses by collective loop transformations.
ICS 1991: 194-205 |
| 17 | | Guang R. Gao,
Qi Ning:
Loop Storage Optimization for Dataflow Machines.
LCPC 1991: 359-373 |
| 16 | | Herbert H. J. Hum,
Guang R. Gao:
A Novel High-Speed Memory Organization for Fine-Grain Multi-Thread Computing.
PARLE (1) 1991: 34-51 |
| 15 | | Guang R. Gao,
Herbert H. J. Hum,
Jean-Marc Monti:
Towards an Efficient Hybrid Dataflow Architecture Model.
PARLE (1) 1991: 355-371 |
| 14 | | Guang R. Gao,
Yue-Bong Wong,
Qi Ning:
A Timed Petri-Net Model for Fine-Grain Loop Scheduling.
PLDI 1991: 204-218 |
| 13 | EE | Kevin B. Theobald,
Guang R. Gao:
An efficient parallel algorithm for all pairs examination.
SC 1991: 742-753 |
| 1990 |
| 12 | | Guang R. Gao,
Herbert H. J. Hum,
Yue-Bong Wong:
An Efficient Scheme for Fine-Grain Software Pipelining.
CONPAR 1990: 709-720 |
| 11 | EE | Guang R. Gao,
Herbert H. J. Hum,
Yue-Bong Wong:
Towards efficient fine-grain software pipelining.
ICS 1990: 369-379 |
| 10 | EE | Guang R. Gao,
Robert Kim Yates,
Jack B. Dennis,
Lenor M. R. Mullin:
A strict monolithic array constructor.
SPDP 1990: 596-603 |
| 9 | | Guang R. Gao:
Exploiting fine-grain parallelism on dataflow architectures.
Parallel Computing 13(3): 309-320 (1990) |
| 1989 |
| 8 | | Guang R. Gao:
Algorithmic Aspects of Balancing Techniques for Pipelined Data Flow Code Generation.
J. Parallel Distrib. Comput. 6(1): 39-61 (1989) |
| 1988 |
| 7 | | Guang R. Gao,
René Tio,
Herbert H. J. Hum:
Design of an Efficient Dataflow Architecture without Data Flow.
FGCS 1988: 861-868 |
| 6 | EE | Jack B. Dennis,
Guang R. Gao:
An efficient pipelined dataflow processor architecture.
SC 1988: 368-373 |
| 1987 |
| 5 | EE | Guang R. Gao:
A stability classification method and its application to pipelined solution of linear recurrences.
Parallel Computing 4(3): 305-321 (1987) |
| 1986 |
| 4 | | Guang R. Gao:
A Pipelined Solution Method of Tridiagonal Linear Equation Systems.
ICPP 1986: 84-91 |
| 3 | | Guang R. Gao:
A Maximally Pipelined Tridiagonal Linear Equation Solver.
J. Parallel Distrib. Comput. 3(2): 215-235 (1986) |
| 1984 |
| 2 | | Jack B. Dennis,
Guang R. Gao,
Kenneth W. Todd:
Modeling the Weather with a Data Flow Supercomputer.
IEEE Trans. Computers 33(7): 592-603 (1984) |
| 1983 |
| 1 | | Jack B. Dennis,
Guang R. Gao:
Maximum Pipelining of Array Operations on Static Data Flow Machine.
ICPP 1983: 331-334 |