2009 |
108 | EE | Per Stenström:
Transactions on High-Performance Embedded Architectures and Compilers II
Springer 2009 |
107 | EE | Martin Thuresson,
Magnus Själander,
Per Stenström:
A Flexible Code Compression Scheme Using Partitioned Look-Up Tables.
HiPEAC 2009: 95-109 |
106 | EE | M. M. Waliullah,
Per Stenström:
Schemes for avoiding starvation in transactional memory systems.
Concurrency and Computation: Practice and Experience 21(7): 859-873 (2009) |
105 | EE | Per Stenström,
David B. Whalley:
Introduction.
T. HiPEAC 2: 3 (2009) |
2008 |
104 | | Per Stenström,
Michel Dubois,
Manolis Katevenis,
Rajiv Gupta,
Theo Ungerer:
High Performance Embedded Architectures and Compilers, Third International Conference, HiPEAC 2008, Göteborg, Sweden, January 27-29, 2008, Proceedings
Springer 2008 |
103 | EE | Martin Thuresson,
Per Stenström:
Accommodation of the Bandwidth of Large Cache Blocks Using Cache/Memory Link Compression.
ICPP 2008: 478-486 |
102 | EE | M. M. Waliullah,
Per Stenström:
Intermediate checkpointing with conflicting access prediction in transactional memory systems.
IPDPS 2008: 1-11 |
101 | EE | Reinhard Wilhelm,
Jakob Engblom,
Andreas Ermedahl,
Niklas Holsti,
Stephan Thesing,
David B. Whalley,
Guillem Bernat,
Christian Ferdinand,
Reinhold Heckmann,
Tulika Mitra,
Frank Mueller,
Isabelle Puaut,
Peter P. Puschner,
Jan Staschulat,
Per Stenström:
The worst-case execution-time problem - overview of methods and survey of tools.
ACM Trans. Embedded Comput. Syst. 7(3): (2008) |
100 | EE | Martin Thuresson,
Lawrence Spracklen,
Per Stenström:
Memory-Link Compression Schemes: A Value Locality Perspective.
IEEE Trans. Computers 57(7): 916-927 (2008) |
99 | EE | Fredrik Warg,
Per Stenström:
Dual-thread Speculation: A Simple Approach to Uncover Thread-level Parallelism on a Simultaneous Multithreaded Processor.
International Journal of Parallel Programming 36(2): 166-183 (2008) |
98 | EE | Md. Mafijul Islam,
Magnus Själander,
Per Stenström:
Early detection and bypassing of trivial operations to improve energy efficiency of processors.
Microprocessors and Microsystems - Embedded Hardware Design 32(4): 183-196 (2008) |
2007 |
97 | | Utpal Banerjee,
José Moreira,
Michel Dubois,
Per Stenström:
Proceedings of the 4th Conference on Computing Frontiers, 2007, Ischia, Italy, May 7-9, 2007
ACM 2007 |
96 | | Per Stenström,
Michael F. P. O'Boyle,
François Bodin,
Marcelo Cintra,
Sally A. McKee:
Transactions on High-Performance Embedded Architectures and Compilers I
Springer 2007 |
95 | | Koen De Bosschere,
David R. Kaeli,
Per Stenström,
David B. Whalley,
Theo Ungerer:
High Performance Embedded Architectures and Compilers, Second International Conference, HiPEAC 2007, Ghent, Belgium, January 28-30, 2007, Proceedings
Springer 2007 |
94 | EE | Marco Galluzzi,
Enrique Vallejo,
Adrián Cristal,
Fernando Vallejo,
Ramón Beivide,
Per Stenström,
James E. Smith,
Mateo Valero:
Implicit Transactional Memory in Kilo-Instruction Multiprocessors.
Asia-Pacific Computer Systems Architecture Conference 2007: 339-353 |
93 | EE | Shekhar Borkar,
Norman P. Jouppi,
Per Stenström:
Microprocessors in the era of terascale integration.
DATE 2007: 237-242 |
92 | EE | M. M. Waliullah,
Per Stenström:
Starvation-Free Transactional Memory-System Protocols.
Euro-Par 2007: 280-291 |
91 | EE | Haakon Dybdahl,
Per Stenström:
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors.
HPCA 2007: 2-12 |
90 | EE | Md. Mafijul Islam,
Alexander Busck,
Mikael Engbom,
Simji Lee,
Michel Dubois,
Per Stenström:
Loop-level Speculative Parallelism in Embedded Applications.
ICPP 2007: 3 |
89 | EE | Martin Thuresson,
Magnus Själander,
Magnus Björk,
Lars J. Svensson,
Per Larsson-Edefors,
Per Stenström:
FlexCore: Utilizing Exposed Datapath Control for Efficient Computing.
ICSAMOS 2007: 18-25 |
88 | EE | Per Stenström:
IPDPS Panel: Is the Multi-Core Roadmap going to Live Up to its Promises?
IPDPS 2007: 14 |
87 | EE | Md. Mafijul Islam,
Per Stenström:
Energy and Performance Trade-offs between Instruction Reuse and Trivial Computations for Embedded Applications.
SIES 2007: 86-93 |
86 | EE | Jianwei Chen,
Michel Dubois,
Per Stenström:
SimWattch: Integrating Complete-System and User-Level Performance and Power Simulators.
IEEE Micro 27(4): 34-48 (2007) |
85 | EE | Jochen Hollmann,
Anders Ardö,
Per Stenström:
Effectiveness of caching in a distributed digital library system.
Journal of Systems Architecture 53(7): 403-416 (2007) |
84 | EE | Per Stenström:
Introduction to Part 1.
T. HiPEAC 1: 33 (2007) |
83 | EE | Koen De Bosschere,
Wayne Luk,
Xavier Martorell,
Nacho Navarro,
Michael F. P. O'Boyle,
Dionisios N. Pnevmatikatos,
Alex Ramírez,
Pascal Sainrat,
André Seznec,
Per Stenström,
Olivier Temam:
High-Performance Embedded Architecture and Compilation Roadmap.
T. HiPEAC 1: 5-29 (2007) |
2006 |
82 | EE | Haakon Dybdahl,
Per Stenström:
Enhancing Last-Level Cache Performance by Block Bypassing and Early Miss Determination.
Asia-Pacific Computer Systems Architecture Conference 2006: 52-66 |
81 | EE | Jaeheon Jeong,
Per Stenström,
Michel Dubois:
Simple penalty-sensitive replacement policies for caches.
Conf. Computing Frontiers 2006: 341-352 |
80 | EE | Per Stenström:
Chip-multiprocessing and beyond.
HPCA 2006: 109 |
79 | EE | Haakon Dybdahl,
Per Stenström,
Lasse Natvig:
A Cache-Partitioning Aware Replacement Policy for Chip Multiprocessors.
HiPC 2006: 22-34 |
78 | EE | Md. Mafijul Islam,
Per Stenström:
Reduction of Energy Consumption in Processors by Early Detection and Bypassing of Trivial Operations.
ICSAMOS 2006: 28-34 |
77 | EE | Martin Thuresson,
Per Stenström:
Scalable Value-Cache Based Compression Schemes for Multiprocessors.
SBAC-PAD 2006: 117-124 |
76 | EE | Fredrik Warg,
Per Stenström:
Dual-Thread Speculation: Two Threads in the Machine are Worth Eight in the Bush.
SBAC-PAD 2006: 91-98 |
75 | EE | Burkhard Monien,
Guang Gao,
Horst Simon,
Paul G. Spirakis,
Per Stenström:
Introduction.
J. Parallel Distrib. Comput. 66(5): 615-616 (2006) |
2005 |
74 | EE | Fredrik Warg,
Per Stenström:
Reducing misspeculation overhead for module-level speculative execution.
Conf. Computing Frontiers 2005: 289-298 |
73 | EE | Martin Thuresson,
Per Stenström:
Evaluation of extended dictionary-based static code compression schemes.
Conf. Computing Frontiers 2005: 77-86 |
72 | EE | Per Stenström:
The Chip-Multiprocessing Paradigm Shift: Opportunities and Challenges.
HiPEAC 2005: 5 |
71 | EE | Magnus Ekman,
Per Stenström:
A Cost-Effective Main Memory Organization for Future Servers.
IPDPS 2005 |
70 | EE | Magnus Ekman,
Per Stenström:
A Robust Main-Memory Compression Scheme.
ISCA 2005: 74-85 |
69 | EE | Magnus Ekman,
Per Stenström:
Enhancing Multiprocessor Architecture Simulation Speed Using Matched-Pair Comparison.
ISPASS 2005: 89-99 |
68 | EE | Frank Mueller,
Per Stenström:
Introduction to the special issue.
ACM Trans. Embedded Comput. Syst. 4(1): 1-2 (2005) |
2004 |
67 | EE | Martin Kämpe,
Per Stenström,
Michel Dubois:
Self-correcting LRU replacement policies.
Conf. Computing Frontiers 2004: 181-191 |
66 | EE | Magnus Ekman,
Per Stenström:
A case for multi-level main memory.
WMPI 2004: 1-8 |
65 | EE | Håkan Grahn,
Per Stenström:
A comparative evaluation of hardware-only and software-only directory protocols in shared-memory multiprocessors.
Journal of Systems Architecture 50(9): 537-561 (2004) |
64 | EE | Jonas Jalminger,
Per Stenström:
A cache block reuse prediction scheme.
Microprocessors and Microsystems 28(7): 373-385 (2004) |
2003 |
63 | | Jochen Hollmann,
Anders Ardö,
Per Stenström:
An Evaluation of Document Prefetching in a Distributed Digital Library.
ECDL 2003: 276-287 |
62 | EE | Per Stenström:
One Chip, One Server: How Do We Exploit Its Power?
HiPC 2003: 405 |
61 | EE | Jonas Jalminger,
Per Stenström:
A Novel Approach to Cache Block Reuse Predictions.
ICPP 2003: 294- |
60 | EE | Magnus Ekman,
Per Stenström:
Performance and Power Impact of Issue-width in Chip-Multiprocessor Cores.
ICPP 2003: 359-368 |
59 | EE | Jim Nilsson,
Anders Landin,
Per Stenström:
The Coherence Predictor Cache: A Resource-Efficient and Accurate Coherence Prediction Infrastructure.
IPDPS 2003: 10 |
58 | EE | Peter Rundberg,
Per Stenström:
Speculative Lock Reordering: Optimistic Out-of-Order Execution of Critical Sections.
IPDPS 2003: 11 |
57 | EE | Fredrik Warg,
Per Stenström:
Improving Speculative Thread-Level Parallelism Through Module Run-Length Prediction.
IPDPS 2003: 12 |
56 | EE | Jianwei Chen,
Michel Dubois,
Per Stenström:
Integrating complete-system and user-level performance/power simulators: the SimWattch approach.
ISPASS 2003: 1-10 |
2002 |
55 | EE | Martin Kämpe,
Per Stenström,
Michel Dubois:
The FAB Predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches.
HPCA 2002: 223-232 |
54 | EE | Jochen Hollmann,
Anders Ardö,
Per Stenström:
Empirical Observations Regarding Predictability in User Access-Behavior in a Distributed Digital Library System.
IPDPS 2002 |
53 | EE | Magnus Ekman,
Per Stenström,
Fredrik Dahlgren:
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors.
ISLPED 2002: 243-246 |
52 | EE | Jonas Jalminger,
Per Stenström:
Improvement of energy-efficiency in off-chip caches by selective prefetching.
Microprocessors and Microsystems 26(3): 107-121 (2002) |
2001 |
51 | EE | Ulf Assarsson,
Per Stenström:
A Case Study of Load Distribution in Parallel View Frustum Culling and Collision Detection.
Euro-Par 2001: 663-673 |
50 | EE | Fredrik Warg,
Per Stenström:
Limits on Speculative Module-Level Parallelism in Imperative and Object-Oriented Programs on CMP Platforms.
IEEE PACT 2001: 221-230 |
49 | EE | Peter Rundberg,
Per Stenström:
An All-Software Thread-Level Data Dependence Speculation System for Multiprocessors.
J. Instruction-Level Parallelism 3: (2001) |
2000 |
48 | EE | Silvia M. Müller,
Per Stenström,
Mateo Valero,
Stamatis Vassiliadis:
Parallel Computer Architecture.
Euro-Par 2000: 537-538 |
47 | EE | Magnus Karlsson,
Fredrik Dahlgren,
Per Stenström:
A Prefetching Technique for Irregular Accesses to Linked Data Structures.
HPCA 2000: 206-217 |
46 | EE | Ashley Saulsbury,
Fredrik Dahlgren,
Per Stenström:
Recency-based TLB preloading.
ISCA 2000: 117-127 |
45 | EE | Magnus Karlsson,
Per Stenström:
An analytical model of the working-set sizes in decision-support systems.
SIGMETRICS 2000: 275-285 |
44 | | Per Stenström,
Erik Hagersten,
David J. Lilja,
Margaret Martonosi,
Madan Venugopal:
Shared-memory multiprocessing: Current state and future directions.
Advances in Computers 53: 2-55 (2000) |
43 | | Håkan Grahn,
Per Stenström:
Comparative Evaluation of Latency-Tolerating and -Reducing Techniques for Hardware-Only and Software-Only Directory Protocols.
J. Parallel Distrib. Comput. 60(7): 807-834 (2000) |
1999 |
42 | EE | Thomas Lundqvist,
Per Stenström:
Timing Anomalies in Dynamically Scheduled Microprocessors.
IEEE Real-Time Systems Symposium 1999: 12-21 |
41 | EE | Thomas Lundqvist,
Per Stenström:
A Method to Improve the Estimated Worst-Case Performance of Data Caching.
RTCSA 1999: 255-262 |
40 | | Jonas Skeppstedt,
Fredrik Dahlgren,
Per Stenström:
Evaluation of Compiler-Controlled Updating to Reduce Coherence-Miss Penalties in Shared-Memory Multiprocessors.
J. Parallel Distrib. Comput. 56(2): 122-143 (1999) |
39 | | Thomas Lundqvist,
Per Stenström:
An Integrated Path and Timing Analysis Method based on Cycle-Level Symbolic Execution.
Real-Time Systems 17(2-3): 183-207 (1999) |
1998 |
38 | EE | Thomas Lundqvist,
Per Stenström:
Integrating Path and Timing Analysis Using Instruction-Level Simulation Techniques.
LCTES 1998: 1-15 |
37 | | Fredrik Dahlgren,
Michel Dubois,
Per Stenström:
Performance Evaluation and Cost Analysis of Cache Protocol Extensions for Shared-Memory Multiprocessors.
IEEE Trans. Computers 47(10): 1041-1055 (1998) |
1997 |
36 | | Per Stenström,
Jonas Skeppstedt:
A Performance Tuning Approach for Shared-Memory Multiprocessors.
Euro-Par 1997: 72-83 |
35 | EE | Håkan Grahn,
Per Stenström:
Relative Performance of Hardware and Software-Only Directory Protocols Under Latency Tolerating and Reducing Techniques.
IPPS 1997: 500- |
34 | | Fredrik Dahlgren,
Per Stenström,
Mårten Björkman:
Reducing the Read-Miss Penalty for Flat COMA Protocols.
Comput. J. 40(4): 208-219 (1997) |
33 | | Per Stenström,
Erik Hagersten,
David J. Lilja,
Margaret Martonosi,
Madan Venugopal:
Trends in Shared Memory Multiprocessing.
IEEE Computer 30(12): 44-50 (1997) |
32 | | Per Stenström,
Mats Brorsson,
Fredrik Dahlgren,
Håkan Grahn,
Michel Dubois:
Boosting the Performance of Shared Memory Multiprocessors.
IEEE Computer 30(7): 63-70 (1997) |
31 | | Magnus Karlsson,
Per Stenström:
Effectiveness of Dynamic Prefetching in Multiple-Writer Distributed Virtual Shared-Memory Systems.
J. Parallel Distrib. Comput. 43(2): 79-93 (1997) |
1996 |
30 | EE | Magnus Karlsson,
Per Stenström:
Performance Evaluation of a Cluster-Based Multiprocessor Built from ATM Switches and Bus-Based Multiprocessor Servers.
HPCA 1996: 4-13 |
29 | EE | Jonas Skeppstedt,
Per Stenström:
Using Dataflow Analysis Techniques to Reduce Ownership Overhead in Cache Coherence Protocols.
ACM Trans. Program. Lang. Syst. 18(6): 659-682 (1996) |
28 | | Per Stenström,
Fredrik Dahlgren:
Applications for Shared Memory Multiprocessors (Guest Editors' Introduction).
IEEE Computer 29(12): 29-31 (1996) |
27 | EE | Fredrik Dahlgren,
Per Stenström:
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors.
IEEE Trans. Parallel Distrib. Syst. 7(4): 385-398 (1996) |
26 | | Håkan Grahn,
Per Stenström:
Evaluation of a Competitive-Update Cache Coherence Protocol with Migratory Data Detection.
J. Parallel Distrib. Comput. 39(2): 168-180 (1996) |
25 | | Mats Brorsson,
Per Stenström:
Characterising and Modelling Shared Memory Accesses in Multiprocessor Programs.
Parallel Computing 22(6): 869-893 (1996) |
1995 |
24 | EE | Mårten Björkman,
Fredrik Dahlgren,
Per Stenström:
Using hints to reduce the read miss penalty for flat COMA protocols.
HICSS (1) 1995: 242-251 |
23 | | Fredrik Dahlgren,
Per Stenström:
Effectiveness of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors.
HPCA 1995: 68-77 |
22 | EE | Håkan Grahn,
Per Stenström:
Efficient Strategies for Software-Only Protocols in Shared-Memory Multiprocessors.
ISCA 1995: 38-47 |
21 | EE | Fredrik Dahlgren,
Michel Dubois,
Per Stenström:
Sequential Hardware Prefetching in Shared-Memory Multiprocessors.
IEEE Trans. Parallel Distrib. Syst. 6(7): 733-746 (1995) |
20 | | Fredrik Dahlgren,
Per Stenström:
Using Write Caches to Improve Performance of Cache Coherence Protocols in Shared-Memory Multiprocessors.
J. Parallel Distrib. Comput. 26(2): 193-210 (1995) |
19 | | Michel Dubois,
Jonas Skeppstedt,
Per Stenström:
Essential Misses and Data Traffic in Coherence Protocols.
J. Parallel Distrib. Comput. 29(2): 108-125 (1995) |
1994 |
18 | | Jonas Skeppstedt,
Per Stenström:
Simple Compiler Algorithms to Reduce Ownership Overhead in Cache Coherence Protocols.
ASPLOS 1994: 286-296 |
17 | | Per Stenström:
Introduction.
HICSS (1) 1994: 520-521 |
16 | | Fong Pong,
Per Stenström,
Michel Dubois:
An Integrated Methodology for the Verification of Directory-Based Cache Protocols.
ICPP (1) 1994: 158-165 |
15 | | Fredrik Dahlgren,
Per Stenström:
Reducing the Write Traffic for a Hybrid Cache Protocol.
ICPP (1) 1994: 166-173 |
14 | | Fredrik Dahlgren,
Michel Dubois,
Per Stenström:
Combined Performance Gains of Simple Cache Protocol Extensions.
ISCA 1994: 187-197 |
13 | | Håkan Nilsson,
Per Stenström:
An Adaptive Update-Based Cache Coherence Protocol for Reduction of Miss Rate and Traffic.
PARLE 1994: 363-374 |
1993 |
12 | | Fredrik Dahlgren,
Michel Dubois,
Per Stenström:
Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors.
ICPP 1993: 56-63 |
11 | | Per Stenström,
Mats Brorsson,
Lars Sandberg:
An Adaptive Cache Coherence Protocol Optimized for Migratory Sharing.
ISCA 1993: 109-118 |
10 | | Michel Dubois,
Jonas Skeppstedt,
Livio Ricciulli,
Krishnan Ramamurthy,
Per Stenström:
The Detection and Elimination of Useless Misses in Multiprocessors.
ISCA 1993: 88-97 |
1992 |
9 | | Per Stenström:
A Latency-Hiding Scheme for Multiprocessors with Buffered Multistage Networks.
IPPS 1992: 39-42 |
8 | | Per Stenström,
Truman Joe,
Anoop Gupta:
Comparative Performance Evaluation of Cache-Coherent NUMA and COMA Architectures.
ISCA 1992: 80-91 |
7 | | Håkan Nilsson,
Per Stenström:
The Scalable Tree Protocol - A Cache Coherence Approach for Large-Scale Multiprocessors.
SPDP 1992: 498-506 |
1991 |
6 | | Per Stenström,
Fredrik Dahlgren,
Lars Lundberg:
A Lockup-Free Multiprocessor Cache Design.
ICPP (1) 1991: 246-250 |
5 | EE | Fredrik Dahlgren,
Per Stenström:
On Reconfigurable On-Chip Data Caches.
MICRO 1991: 189-198 |
1990 |
4 | | Per Stenström:
A Survey of Cache Coherence Schemes for Multiprocessors.
IEEE Computer 23(6): 12-24 (1990) |
1989 |
3 | EE | Per Stenström:
A Cache Consistency Protocol for Multiprocessors with Multistage Networks.
ISCA 1989: 407-415 |
1988 |
2 | | Per Stenström:
Reducing Contention in Shared-Memory Multiprocessors.
IEEE Computer 21(11): 26-37 (1988) |
1987 |
1 | | Per Stenström,
Lars Philipson:
A Layered Emulator for Design Evaluation of MIMD Multiprocessors with Shared Memory.
PARLE (1) 1987: 329-344 |