2009 |
114 | EE | Hassan Salamy,
J. Ramanujam:
A Framework for Task Scheduling and Memory Partitioning for Multi-Processor System-on-Chip.
HiPEAC 2009: 263-277 |
113 | EE | Muthu Manikandan Baskaran,
Nagavijayalakshmi Vydyanathan,
Uday Bondhugula,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors.
PPOPP 2009: 219-228 |
2008 |
112 | EE | Uday Bondhugula,
Muthu Manikandan Baskaran,
Sriram Krishnamoorthy,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model.
CC 2008: 132-146 |
111 | EE | Hassan Salamy,
J. Ramanujam:
Storage optimization through code size reduction for digital signal processors.
ESTImedia 2008: 107-112 |
110 | EE | Hassan Salamy,
J. Ramanujam:
Optimal address register allocation for arrays in DSP applications.
ESTImedia 2008: 67-72 |
109 | EE | Muthu Manikandan Baskaran,
Uday Bondhugula,
Sriram Krishnamoorthy,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
A compiler framework for optimization of affine loop nests for GPGPUs.
ICS 2008: 225-234 |
108 | EE | Uday Bondhugula,
Muthu Manikandan Baskaran,
Albert Hartono,
Sriram Krishnamoorthy,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Towards effective automatic parallelization for multicore systems.
IPDPS 2008: 1-5 |
107 | EE | Uday Bondhugula,
Albert Hartono,
J. Ramanujam,
P. Sadayappan:
A practical automatic polyhedral parallelizer and locality optimizer.
PLDI 2008: 101-113 |
106 | EE | Muthu Manikandan Baskaran,
Uday Bondhugula,
Sriram Krishnamoorthy,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories.
PPOPP 2008: 1-10 |
2007 |
105 | EE | Jinpyo Hong,
J. Ramanujam:
Memory Offset Assignment for DSPs.
ICESS 2007: 80-87 |
104 | EE | Sriram Krishnamoorthy,
Muthu Manikandan Baskaran,
Uday Bondhugula,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Effective automatic parallelization of stencil computations.
PLDI 2007: 235-244 |
103 | EE | Uday Bondhugula,
J. Ramanujam,
P. Sadayappan:
Automatic mapping of nested loops to FPGAs.
PPOPP 2007: 101-111 |
102 | EE | Sai Pinnepalli,
Jinpyo Hong,
J. Ramanujam,
Doris L. Carver:
Code Size Optimization for Embedded Processors using Commutative Transformations.
RTCSA 2007: 409-416 |
101 | EE | Xiaoyang Gao,
Sriram Krishnamoorthy,
Swarup Kumar Sahoo,
Chi-Chung Lam,
Gerald Baumgartner,
J. Ramanujam,
P. Sadayappan:
Efficient search-space pruning for integrated fusion and tiling transformations.
Concurrency and Computation: Practice and Experience 19(18): 2425-2443 (2007) |
2006 |
100 | | Eduard Ayguadé,
Gerald Baumgartner,
J. Ramanujam,
P. Sadayappan:
Languages and Compilers for Parallel Computing, 18th International Workshop, LCPC 2005, Hawthorne, NY, USA, October 20-22, 2005, Revised Selected Papers
Springer 2006 |
99 | EE | A. Allam,
J. Ramanujam,
Gerald Baumgartner,
P. Sadayappan:
Memory minimization for tensor contractions using integer linear programming.
IPDPS 2006 |
98 | EE | Albert Hartono,
Qingda Lu,
Xiaoyang Gao,
Sriram Krishnamoorthy,
Marcel Nooijen,
Gerald Baumgartner,
David E. Bernholdt,
Venkatesh Choppella,
Russell M. Pitzer,
J. Ramanujam,
Atanas Rountev,
P. Sadayappan:
Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations.
International Conference on Computational Science (1) 2006: 267-275 |
97 | EE | Hassan Salamy,
J. Ramanujam:
An Effective Heuristic for Simple Offset Assignment with Variable Coalescing.
LCPC 2006: 158-172 |
96 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Ugur Sezer:
Improving the energy behavior of block buffering using compiler optimizations.
ACM Trans. Design Autom. Electr. Syst. 11(1): 228-250 (2006) |
95 | EE | Guilin Chen,
Mahmut T. Kandemir,
Mary Jane Irwin,
J. Ramanujam:
Reducing code size through address register assignment.
ACM Trans. Embedded Comput. Syst. 5(1): 225-258 (2006) |
94 | EE | J. Ramanujam,
Jinpyo Hong,
Mahmut T. Kandemir,
Amit Narayan,
A. Agarwal:
Estimating and reducing the memory requirements of signal processing codes for embedded systems.
IEEE Transactions on Signal Processing 54(1): 286-294 (2006) |
93 | EE | Sandhya Krishnan,
Sriram Krishnamoorthy,
Gerald Baumgartner,
Chi-Chung Lam,
J. Ramanujam,
P. Sadayappan,
Venkatesh Choppella:
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver.
J. Parallel Distrib. Comput. 66(5): 659-673 (2006) |
2005 |
92 | EE | Albert Hartono,
Alexander Sibiryakov,
Marcel Nooijen,
Gerald Baumgartner,
David E. Bernholdt,
So Hirata,
Chi-Chung Lam,
Russell M. Pitzer,
J. Ramanujam,
P. Sadayappan:
Automated Operation Minimization of Tensor Contraction Expressions in Electronic Structure Calculations.
International Conference on Computational Science (1) 2005: 155-164 |
91 | EE | Xiaoyang Gao,
Sriram Krishnamoorthy,
Swarup Kumar Sahoo,
Chi-Chung Lam,
Gerald Baumgartner,
J. Ramanujam,
P. Sadayappan:
Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations.
LCPC 2005: 215-229 |
90 | EE | Xiaoyang Gao,
Swarup Kumar Sahoo,
Chi-Chung Lam,
J. Ramanujam,
Qingda Lu,
Gerald Baumgartner,
P. Sadayappan:
Performance modeling and optimization of parallel out-of-core tensor contractions.
PPOPP 2005: 266-276 |
2004 |
89 | EE | Sandhya Krishnan,
Sriram Krishnamoorthy,
Gerald Baumgartner,
Chi-Chung Lam,
J. Ramanujam,
P. Sadayappan,
Venkatesh Choppella:
Efficient Synthesis of Out-of-Core Algorithms Using a Nonlinear Optimization Solver.
IPDPS 2004 |
88 | EE | Qingda Lu,
Xiaoyang Gao,
Sriram Krishnamoorthy,
Gerald Baumgartner,
J. Ramanujam,
P. Sadayappan:
Empirical Performance-Model Driven Data Layout Optimization.
LCPC 2004: 72-86 |
87 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Mary Jane Irwin,
Narayanan Vijaykrishnan,
Ismail Kadayif,
Amisha Parikh:
A compiler-based approach for dynamically managing scratch-pad memories in embedded systems.
IEEE Trans. on CAD of Integrated Circuits and Systems 23(2): 243-260 (2004) |
2003 |
86 | EE | Mahmut T. Kandemir,
Mary Jane Irwin,
Guilin Chen,
J. Ramanujam:
Address Register Assignment for Reducing Code Size.
CC 2003: 273-289 |
85 | EE | Sandhya Krishnan,
Sriram Krishnamoorthy,
Gerald Baumgartner,
Daniel Cociorva,
Chi-Chung Lam,
P. Sadayappan,
J. Ramanujam,
David E. Bernholdt,
Venkatesh Choppella:
Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms.
HiPC 2003: 406-417 |
84 | EE | Daniel Cociorva,
Xiaoyang Gao,
Sandhya Krishnan,
Gerald Baumgartner,
Chi-Chung Lam,
P. Sadayappan,
J. Ramanujam:
Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints.
IPDPS 2003: 37 |
83 | EE | Alina Bibireata,
Sandhya Krishnan,
Gerald Baumgartner,
Daniel Cociorva,
Chi-Chung Lam,
P. Sadayappan,
J. Ramanujam,
David E. Bernholdt,
Venkatesh Choppella:
Memory-Constrained Data Locality Optimization for Tensor Contractions.
LCPC 2003: 93-108 |
82 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
Reducing False Sharing and Improving Spatial Locality in a Unified Compilation Framework.
IEEE Trans. Parallel Distrib. Syst. 14(4): 337-354 (2003) |
2002 |
81 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Exploiting shared scratch pad memory space in embedded multiprocessor systems.
DAC 2002: 219-224 |
80 | EE | Gerald Baumgartner,
David E. Bernholdt,
Daniel Cociorva,
Chi-Chung Lam,
J. Ramanujam,
Robert J. Harrison,
Marcel Nooijen,
P. Sadayappan:
A Performance Optimization Framework for Compilation of Tensor Contraction Expressions into Parallel Programs.
IPDPS 2002 |
79 | EE | Daniel Cociorva,
Gerald Baumgartner,
Chi-Chung Lam,
P. Sadayappan,
J. Ramanujam:
Memory-Constrained Communication Minimization for a Class of Array Computations.
LCPC 2002: 1-15 |
78 | EE | Daniel Cociorva,
Gerald Baumgartner,
Chi-Chung Lam,
P. Sadayappan,
J. Ramanujam,
Marcel Nooijen,
David E. Bernholdt,
Robert J. Harrison:
Space-Time Trade-Off Optimization for a Class of Electronic Structure Calculations.
PLDI 2002: 177-186 |
77 | EE | Gerald Baumgartner,
David E. Bernholdt,
Daniel Cociorva,
Robert J. Harrison,
So Hirata,
Chi-Chung Lam,
Marcel Nooijen,
Russell M. Pitzer,
J. Ramanujam,
P. Sadayappan:
A high-level approach to synthesis of high-performance codes for quantum chemistry.
SC 2002: 1-10 |
76 | EE | J. Ramanujam,
Sandeep Deshpande,
Jinpyo Hong,
Mahmut T. Kandemir:
A Heuristic for Clock Selection in High-Level Synthesis.
VLSI Design 2002: 414-419 |
75 | EE | J. Ramanujam,
Satish Krishnamurthy,
Jinpyo Hong,
Mahmut T. Kandemir:
Address Code and Arithmetic Optimizations for Embedded Systems.
VLSI Design 2002: 619-624 |
74 | EE | N. E. Crosbie,
Mahmut T. Kandemir,
Ibrahim Kolcu,
J. Ramanujam,
Alok N. Choudhary:
Strategies for Improving Data Locality in Embedded Applications.
VLSI Design 2002: 631- |
73 | | J. Ramanujam:
Automatic Data Distribution.
The Compiler Design Handbook 2002: 409-460 |
72 | | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam:
An I/O-Conscious Tiling Strategy for Disk-Resident Data Sets.
The Journal of Supercomputing 21(3): 257-284 (2002) |
2001 |
71 | EE | J. Ramanujam:
Integer Lattice Based Methods for Local Address Generation for Block-Cyclic Distributions.
Compiler Optimizations for Scalable Parallel Systems Languages 2001: 597-648 |
70 | EE | J. Ramanujam,
Jinpyo Hong,
Mahmut T. Kandemir,
Amit Narayan:
Reducing Memory Requirements of Nested Loops for Embedded Systems.
DAC 2001: 359-364 |
69 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Mary Jane Irwin,
Narayanan Vijaykrishnan,
Ismail Kadayif,
Amisha Parikh:
Dynamic Management of Scratch-Pad Memory Space.
DAC 2001: 690-695 |
68 | EE | Daniel Cociorva,
J. W. Wilkins,
Gerald Baumgartner,
P. Sadayappan,
J. Ramanujam,
Marcel Nooijen,
David E. Bernholdt,
Robert J. Harrison:
Towards Automatic Synthesis of High-Performance Codes for Electronic Structure Calculations: Data Locality Optimization.
HiPC 2001: 237-248 |
67 | EE | Daniel Cociorva,
J. W. Wilkins,
Chi-Chung Lam,
Gerald Baumgartner,
J. Ramanujam,
P. Sadayappan:
Loop optimization for a class of memory-constrained computations.
ICS 2001: 103-113 |
66 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Ugur Sezer:
Compiler support for block buffering.
ISLPED 2001: 76-79 |
65 | | Ismail Kadayif,
Mahmut T. Kandemir,
Narayanan Vijaykrishnan,
Mary Jane Irwin,
J. Ramanujam:
Morphable Cache Architectures: Potential Benefits.
LCTES/OM 2001: 128-137 |
64 | EE | M. Narasimhan,
J. Ramanujam:
A fast approach to computing exact solutions to the resource-constrained scheduling problem.
ACM Trans. Design Autom. Electr. Syst. 6(4): 490-500 (2001) |
63 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary,
Prithviraj Banerjee:
A Layout-Conscious Iteration Space Transformation Technique.
IEEE Trans. Computers 50(12): 1321-1336 (2001) |
62 | EE | Mahmut T. Kandemir,
J. Ramanujam:
Data Relation Vectors: A New Abstraction for Data Optimizations.
IEEE Trans. Computers 50(8): 798-810 (2001) |
61 | EE | Mahmut T. Kandemir,
Prithviraj Banerjee,
Alok N. Choudhary,
J. Ramanujam,
Eduard Ayguadé:
Static and Dynamic Locality Optimizations Using Integer Linear Programming.
IEEE Trans. Parallel Distrib. Syst. 12(9): 922-941 (2001) |
60 | EE | Siddharth Rele,
Vipin Jain,
Santosh Pande,
J. Ramanujam:
Compact and efficient code generation through program restructuring on limited memory embedded DSPs.
IEEE Trans. on CAD of Integrated Circuits and Systems 20(4): 477-494 (2001) |
2000 |
59 | EE | M. Narasimhan,
J. Ramanujam:
On lower bounds for scheduling problems in high-level synthesis.
DAC 2000: 546-551 |
58 | EE | Sunil Atri,
J. Ramanujam,
Mahmut T. Kandemir:
Improving Offset Assignment on Embedded Processors Using Transformations.
HiPC 2000: 367-374 |
57 | EE | Mahmut T. Kandemir,
J. Ramanujam:
Data Relation Vectors: A New Abstraction for Data Optimizations.
IEEE PACT 2000: 227-236 |
56 | EE | Sunil Atri,
J. Ramanujam,
Mahmut T. Kandemir:
Improving Offset Assignment for Embedded Processors.
LCPC 2000: 158-172 |
55 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
Prithviraj Banerjee,
J. Ramanujam,
U. Nagaraj Shenoy:
Minimizing Data and Synchronization Costs in One-Way Communication.
IEEE Trans. Parallel Distrib. Syst. 11(12): 1232-1251 (2000) |
54 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Meenakshi A. Kandaswamy:
A Unified Framework for Optimizing Locality, Parallelism, and Communication in Out-of-Core Computations.
IEEE Trans. Parallel Distrib. Syst. 11(7): 648-668 (2000) |
53 | | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed-Memory Machines.
J. Parallel Distrib. Comput. 60(8): 924-965 (2000) |
1999 |
52 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam:
I/O-Conscious Tiling for Disk-Resident Data Sets.
Euro-Par 1999: 430-439 |
51 | | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam:
Restructuring I/O-Intensive Computations for Locality.
HPCN Europe 1999: 1097-1106 |
50 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam:
Compiler Optimizations for I/O-Intensive Computations.
ICPP 1999: 164-171 |
49 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
A Framework for Interprocedural Locality Optimization Using Both Loop and Data Layout Transformations.
ICPP 1999: 95-102 |
48 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
On Reducing False Sharing while Improving Locality on Shared Memory Multiprocessors.
IEEE PACT 1999: 203-211 |
47 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
A Graph Based Framework to Detect Optimal Memory Layouts for Improving Data Locality.
IPPS/SPDP 1999: 738-743 |
46 | EE | Mahmut T. Kandemir,
Prithviraj Banerjee,
Alok N. Choudhary,
J. Ramanujam,
Eduard Ayguadé:
An integer linear programming approach for optimizing cache locality.
International Conference on Supercomputing 1999: 500-509 |
45 | EE | Vipin Jain,
Siddharth Rele,
Santosh Pande,
J. Ramanujam:
Code Restructuring for Improving Real Time Response through Code Speed, Size Trade-offs on Limited Memory Embedded DSPs.
LCPC 1999: 459-463 |
44 | | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
Improving Locality Using a Graph-Based Technique for Detecting Memory Layouts of Arrays.
PPSC 1999 |
43 | EE | Mahmut T. Kandemir,
Prithviraj Banerjee,
Alok N. Choudhary,
J. Ramanujam,
U. Nagaraj Shenoy:
A global communication optimization technique based on data-flow analysis and linear algebra.
ACM Trans. Program. Lang. Syst. 21(6): 1251-1297 (1999) |
42 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Improving Cache Locality by a Combination of Loop and Data Transformation.
IEEE Trans. Computers 48(2): 159-167 (1999) |
41 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
U. Nagaraj Shenoy,
Prithviraj Banerjee,
J. Ramanujam:
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts.
IEEE Trans. Parallel Distrib. Syst. 10(2): 115-135 (1999) |
40 | | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
A Matrix-Based Approach to Global Locality Optimization.
J. Parallel Distrib. Comput. 58(2): 190-235 (1999) |
1998 |
39 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
U. Nagaraj Shenoy,
Prithviraj Banerjee:
Enhancing Spatial Locality via Data Layout Optimizations.
Euro-Par 1998: 422-434 |
38 | EE | M. Narasimhan,
J. Ramanujam:
Improving the computational performance of ILP-based problems.
ICCAD 1998: 593-596 |
37 | EE | Mahmut T. Kandemir,
U. Nagaraj Shenoy,
Prithviraj Banerjee,
J. Ramanujam,
Alok N. Choudhary:
Minimizing Data and Synchronization Costs in One-Way Communication.
ICPP 1998: 180-188 |
36 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
A Matrix-Based Approach to the Global Locality Optimization Problem.
IEEE PACT 1998: 306-313 |
35 | EE | Mahmut T. Kandemir,
Prithviraj Banerjee,
Alok N. Choudhary,
J. Ramanujam,
U. Nagaraj Shenoy:
A Generalized Framework for Global Communication Optimization.
IPPS/SPDP 1998: 69-73 |
34 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
U. Nagaraj Shenoy,
Prithviraj Banerjee,
J. Ramanujam:
A Hyperplane Based Approach for Optimizing Spatial Locality in Loop Nests.
International Conference on Supercomputing 1998: 69-76 |
33 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary,
Prithviraj Banerjee:
A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality.
LCPC 1998: 34-50 |
32 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam:
Improving Locality in Out-of-Core Computations Using Data Layout Transformations.
LCR 1998: 359-366 |
31 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Prithviraj Banerjee:
Improving Locality Using Loop and Data Transformations in an Integrated Framework.
MICRO 1998: 285-297 |
30 | | P. Sadayappan,
Fikret Erçal,
J. Ramanujam:
Partitioning Graphs on Message-Passing Machines by Pairwise Mincut.
Inf. Sci. 111(1-4): 223-237 (1998) |
29 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Meenakshi A. Kandaswamy:
Locality Optimization Algorithms for Compilation of Out-of-Core Codes.
J. Inf. Sci. Eng. 14(1): 107-138 (1998) |
28 | | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Rajesh Bordawekar:
Compilation Techniques for Out-of-Core Parallel Computations.
Parallel Computing 24(3-4): 597-628 (1998) |
1997 |
27 | | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Optimization of Out-of-Core Computations Using Chain Vectors.
Euro-Par 1997: 601-608 |
26 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Improving the Performance of Out-of-Core Computations.
ICPP 1997: 128-136 |
25 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed Memory Machines.
IEEE PACT 1997: 236- |
24 | EE | Mahmut T. Kandemir,
Alok N. Choudhary,
J. Ramanujam,
Meenakshi A. Kandaswamy:
A Unified Compiler Algorithm for Optimizing Locality, Parallelism and Communication in Out-of-core Computations.
IOPADS 1997: 79-92 |
23 | EE | Mahmut T. Kandemir,
J. Ramanujam,
Alok N. Choudhary:
A Compiler Algorithm for Optimizing Locality in Loop Nests.
International Conference on Supercomputing 1997: 269-276 |
22 | | J. Ramanujam,
Swaroop Dutta,
Arun Venkatachar:
Code Generation for Complex Subscripts in Data-Parallel Programs.
LCPC 1997: 49-63 |
21 | | Arun Venkatachar,
J. Ramanujam,
Ashwath Thirumalai:
Communication Generation for Block-Cyclic Distributions.
Parallel Processing Letters 7(2): 195-202 (1997) |
1996 |
20 | | Rajesh Bordawekar,
Alok N. Choudhary,
J. Ramanujam:
A Framework for Integrated Communication and I/O Placement.
Euro-Par, Vol. I 1996: 541-552 |
19 | EE | Rajesh Bordawekar,
Alok N. Choudhary,
J. Ramanujam:
Automatic Optimization of Communication in Compiling Out-of-Core Stencil Codes.
International Conference on Supercomputing 1996: 366-373 |
18 | | Arun Venkatachar,
J. Ramanujam,
Ashwath Thirumalai:
Generalized Overlap Regions for Communication Optimization in Data-Parallel Programs.
LCPC 1996: 404-419 |
17 | EE | Rajeev Thakur,
Alok N. Choudhary,
J. Ramanujam:
Efficient Algorithms for Array Redistribution.
IEEE Trans. Parallel Distrib. Syst. 7(6): 587-594 (1996) |
16 | | Ashwath Thirumalai,
J. Ramanujam:
Efficient Computation of Address Sequences in Data Parallel Programs Using Closed Forms for Basis Vectors.
J. Parallel Distrib. Comput. 38(2): 188-203 (1996) |
15 | | Rajesh Bordawekar,
Alok N. Choudhary,
J. Ramanujam:
Compilation and Communication Strategies for Out-of-Core Programs on Distributed Memory Machines.
J. Parallel Distrib. Comput. 38(2): 277-288 (1996) |
1995 |
14 | EE | J. Ramanujam,
S. Vasanthakumar:
Statement-level independent partitioning of uniform recurrences.
IPPS 1995: 229-233 |
13 | EE | S. D. Kaushik,
Chua-Huang Huang,
J. Ramanujam,
P. Sadayappan:
Multi-phase array redistribution: modeling and evaluation.
IPPS 1995: 441-445 |
12 | | Ashwath Thirumalai,
J. Ramanujam:
Fast Address Sequence Generation for Data-Parallel Programs Using Integer Lattices.
LCPC 1995: 191-208 |
11 | | J. Ramanujam,
Amit Narayan:
Integrating Data Distribution and Loop Transformations.
PPSC 1995: 668-673 |
1994 |
10 | | J. Ramanujam:
Optimal Software Pipelining of Nested Loops.
IPPS 1994: 335-342 |
9 | | J. Ramanujam,
A. Mathew:
Analysis of Event Synchronization in Parallel Programs.
LCPC 1994: 300-315 |
1992 |
8 | | J. Ramanujam:
Non-Unimodular Transformations of Nested Loops.
SC 1992: 214-223 |
7 | | J. Ramanujam,
P. Sadayappan:
Tiling Multidimensional Iteration Spaces for Multicomputers.
J. Parallel Distrib. Comput. 16(2): 108-120 (1992) |
1991 |
6 | | J. Ramanujam:
A Linear Algebraic View of Loop Transformations and Their Interaction.
PPSC 1991: 543-548 |
5 | EE | J. Ramanujam,
P. Sadayappan:
Tiling multidimensional iteration spaces for nonshared memory machines.
SC 1991: 111-120 |
4 | EE | J. Ramanujam,
P. Sadayappan:
Compile-Time Techniques for Data Distribution in Distributed Memory Machines.
IEEE Trans. Parallel Distrib. Syst. 2(4): 472-482 (1991) |
1990 |
3 | | J. Ramanujam,
P. Sadayappan:
Tiling of Iteration Spaces for Multicomputers.
ICPP (2) 1990: 179-186 |
2 | | Fikret Erçal,
J. Ramanujam,
P. Sadayappan:
Task Allocation onto a Hypercube by Recursive Mincut Bipartitioning.
J. Parallel Distrib. Comput. 10(1): 35-44 (1990) |
1 | | P. Sadayappan,
Fikret Erçal,
J. Ramanujam:
Cluster partitioning approaches to mapping parallel programs onto a hypercube.
Parallel Computing 13(1): 1-16 (1990) |