Tuesday, June 14 |
|
|
08:00-09:00 |
Registration, Georg Sverdrup's house |
|
08:45-17:30 |
Workshop on Performance Evaluation of Networks for Parallel, Cluster and Grid Computing Systems, auditorium 2 |
|
09:00-17:00 |
The 2nd International Workshop on Embedded Computing, room 3512 |
|
09:00-17:00 |
|
|
09:20-17:00 |
Workshop on Web and Grid Services for Scientific Data Analysis, room 3511 |
Wednesday, June 15 |
|||
|
08:00-09:00 |
Registration, Georg Sverdrup's house |
||
|
09:00-17:00 |
The 7th Workshop on High Performance Scientific and Engineering Computing, room 3511 |
||
|
09:00-09:30 |
Opening Remarks & Awards, auditorium 1 |
||
|
09:30-10:30 |
Keynote: Multi-Core Chips: The Next Wave of Processor Microarchitecture. Antonio González, Director, Intel-UPC Barcelona Research Center, auditorium 1 |
||
|
10:30-11:00 |
Coffee break |
||
|
|
Auditorium 2 |
Room 3512 |
Room 3513 |
|
11:00-12:00 |
Session 1A: Fusun Ozguner Unconventional Scheduling |
Session 1B: Renato Ferreira Resource Allocation |
Session 1C: Yuanyuan Yang Overlays |
|
|
SAREC: A Security-Aware Scheduling Strategy for Real-Time Applications on Clusters. T. Xie, X. Qin, and A. Sung. |
Two-Tier Resource Allocation for Slowdown Differentiation on Server Clusters. X. Zhou, Y. Cai, C. Chow, and M. Augusteijn. |
Design and Implementation of Overlay Multicast Protocol for Multimedia Streaming. T. Baduge, A. Hiromori, H. Yamaguchi, and T. Higashino. |
|
|
Multiprocessor Energy-Efficient Scheduling for Real-Time Tasks with Different Power Characteristics. J.-J. Chen and T.-W. Kuo. |
|
Embedding a Cluster-based Overlay Mesh in Mobile Ad hoc Networks without Cluster Heads. Amit Banerjee, Chung-Ta King, and Hung-Chang Hsiao. |
|
12:00-13:30 |
Lunch break |
||
|
|
Auditorium 2 |
Room 3512 |
Room 3513 |
|
13:30-15:00 |
Session 2A: Antonio González Processor Architecture |
Session 2B: Mark Gardner Compilers and Languages |
Session 2C: Xing Cai Applications |
|
|
Exploring Processor Design Options for Java Based Middleware. Martin Karlsson, Kevin Moore, Erik Hagersten, and David Wood. |
Tuning High Performance Kernels through Empirical Compilation. R. Clint Whaley and David B. Whalley. |
Integrated Performance Monitoring of a Cosmology Application on Leading HEC Platforms. Julian Borrill, Jonathan Carter, Leonid Oliker, David Skinner, and Rupak Biswas. |
|
|
A Vector-uSIMD-VLIW Architecture for Multimedia Applications. |
A Novel Approach for Detecting Heap-based Loop-carried Dependences. |
First Evaluation of Parallel Methods of Automatic Global Image Registration Based on Wavelets. |
|
|
Design Tradeoffs for BLAS Operations on Reconfigurable Hardware. |
Enabling Loop Fusion and Tiling for Cache Performance by Fixing Fusion-Preventing Data Dependences. |
Parallel Algorithm and Implementation for Realtime Dynamic Simulation of Power System. |
|
15:00-15:30 |
Coffee break |
||
|
15:30-17:00 |
Session 3A: Rafael Asenjo On-Chip Parallelism |
Session 3B: Tor Skeie Messaging |
Session 3C: Tarik Cicic Virtual & Optical Networking |
|
|
Heuristics for Profile-Driven Method-Level Speculative Parallelization. J. Whaley and C. Kozyrakis. |
A Preliminary Analysis of the MPI Queue Characteristics of Several Applications. Ron Brightwell, Sue Goudy, and Keith D. Underwood. |
Constructing Battery-Aware Virtual Backbones in Sensor Networks. Chi Ma and Yuanyuan Yang. |
|
|
A Complexity-Effective Simultaneous Multithreading Architecture. Carmelo Acosta, Ayose Falcon, Alex Ramirez and Mateo Valero. |
LiMIC: Support for High-Performance MPI Intra-Node Communication on Linux Cluster. Hyun-Wook Jin, Sayantan Sur, Lei Chai, and Dhabaleswar K. Panda. |
Provisioning Virtual Private Networks in the Hose Model with Delay Requirements. Lei Zhang, Jogesh Muppala and Samuel Chanson. |
|
|
Construction and Compression of Complete Call Graphs for Post-Mortem Program Trace Analysis. Andreas Knuepfer and Wolfgang E. Nagel. |
A Smart TCP Socket for Distributed Computing. Shao Tao and A.L. Ananda. |
On Mapping Multidimensional Weak Tori on Optical Slab Waveguides. Ramachandran Vaidyanathan and Karthik Sethuraman. |
|
17:10 |
Bus to the City Hall. The bus will wait just outside the campus, at the intersection of Blindernveien and Moltke Moes vei. |
||
|
17:30-19:00 |
Reception in the City Hall (business attire) |
||
|
19:15-20:30 |
Boat Trip. Sightseeing on the Oslofjord |
||
Thursday, June 16 |
|||
|
09:00-17:00 |
International Workshop on Wireless and Sensor Networks, room 3511 |
||
|
09:00-10:00 |
Keynote: Making Quake II Massively Multiplayer with OptimalGrid. Glenn Deen, Computer Science, IBM Almaden Research Center, San Jose, California, auditorium 1 |
||
|
10:00-10:30 |
Coffee break |
||
|
|
Auditorium 2 |
Room 3512 |
Room 3513 |
|
10:30-12:00 |
Session 4A: Jingling Xue Shared-Memory Computing |
Session 4B: Mourad Alia P2P Search & Discovery |
Session 4C: Juan Carlos Cano Ad-Hoc Networks |
|
|
Performance Evaluation of the SGI Altix 3700. Thomas H. Dunigan, Jr., Jeffrey S. Vetter, and Patrick H. Worley. |
A C/S and P2P Hybrid Resource Discovery Framework in Grid Environments. Yili Gong, Wei Li, Yuzhong Sun, and Zhiwei Xu. |
BluePower - A New Distributed Multihop Scatternet Formation Protocol for Bluetooth Networks. Yuanrui Zhang, Shu Liu, Weijia Jia, and Xu Cheng. |
|
|
Fast Barriers for Scalable ccNUMA Systems. Liqun Cheng and John B. Carter. |
Differentiated Search in Hierarchical Peer to Peer Networks. Chen Wang, Li Xiao, and Pei Zheng. |
A Compatible and Scalable Clock Synchronization Protocol in IEEE 802.11 Ad Hoc Networks. Dong Zhou and Ten-Hwang Lai. |
|
|
Performance Evaluation of View-Oriented Parallel Programming. Z. Huang, M. Purvis, and P. Werstein. |
A Hybrid Searching Scheme in Unstructured P2P Networks. Xiuqi Li and Jie Wu. |
Single Path Flooding Chain Routing in Mobile Ad Hoc Networks. Ming Ma, Yuanyuan Yang and Chi Ma. |
|
12:00-13:30 |
Lunch break |
||
|
13:30-15:00 |
Session 5A: D. K. Panda Network Performance and Features |
Session 5B: John Carter Network-Based & Grid Computing |
Session 5C: Yunhao Liu Wireless & Mobile Computing |
|
|
Supporting the Sockets Interface over User-level Communication Architecture: Design Issues and Performance Comparisons. J.-W. Jang and J.-S. Kim. |
Session-Based Adaptive Overload Control for Secure Dynamic Web Applications. Jordi Guitart, David Carrera, Vicenç Beltran, Jordi Torres, and Eduard Ayguadé. |
Connected k-Hop Clustering in Ad Hoc Networks. Shuhui Yang, Jie Wu, and Jiannong Cao. |
|
|
An Empirical Approach for Efficient All-to-All Personalized Communication on Ethernet Switched Clusters. Ahmad Faraj and Xin Yuan. |
Efficient Switching Supports of Distributed .NET Remoting with Network Processors. Chung-Kai Chen, Yu-Hao Chang, Cheng-Wei Chen, Yu-Tin Chen, Chih-Chieh Yang, and Jenq-Kuen Lee. |
Stimulus-Based Adaptive Sleeping for Wireless Sensor Networks. H. Ngan, Y. Zhu, L. Ni, and R. Xiao. |
|
|
Considering the Relative Importance of Network Performance and Network Features. Bill Lawry and Keith D. Underwood. |
Service Migration in Distributed Virtual Machines for Adaptive Grid Computing. Song Fu and Cheng-Zhong Xu. |
A New Service Classification Strategy in Hybrid Scheduling to Support Differentiated QoS in Wireless Data Networks. N. Saxena, K. Basu, S. Das, and C. Pinotti. |
|
15:00-15:30 |
Coffee break |
||
|
15:30-17:00 |
Panel Session: How Will Video Games and Multimedia Drive the HPC Market? Auditorium 1 |
||
|
19:00 |
Conference dinner, Gamle Logen, Grev Wedels plass 2, Oslo. Please notify the secretariat if you are not participating. |
||
Friday, June 17 |
|||
|
09:00-17:00 |
4th Workshop on Compile and Runtime Techniques for Parallel Computing, room 3511 |
||
|
09:00-10:00 |
Keynote: Cell Processor: Motivation, Architecture, Design, Programming and Applications. H. Peter Hofstee, IBM Systems and Technology Group, auditorium 1 |
||
|
10:00-10:30 |
Coffee break |
||
|
|
Auditorium 2 |
Room 3512 |
Room 3513 |
|
10:30-12:00 |
Session 6A: Hideharu Amano Higher-Level Network Operations |
Session 6B: Jie Wu Services in P2P Systems |
Session 6C: Jeff Vetter Communications Tools |
|
|
Optimizing Collective Communications on SMP Clusters. M.-S. Wu, R. Kendall, and K. Wright. |
Ferry: An Architecture for Content-Based Publish/Subscribe Services on P2P Networks. Yingwu Zhu and Yiming Hu. |
Low Overhead High Performance Runtime Monitoring of Collective Communication. Lars Ailo Bongo, Otto J. Anshus, and John Markus Bjørndalen. |
|
|
Distributed Queue-Based Locking Using Advanced Network Features. A. Devulapalli and P. Wyckoff. |
Distributed Access Control in CROWN Groups. J. Huai, Y. Zhang, X. Li, and Y. Liu. |
Automatic Experimental Analysis of Communication Patterns in Virtual Topologies. Nikhil Bhatia, Fengguang Song, Felix Wolf, Jack Dongarra, Bernd Mohr, and Shirley Moore. |
|
|
Clustered DBMS Scalability under Unified Ethernet Fabric. Krishna Kant and Amit Sahoo. |
A Peer-to-Peer Replica Management Service for High-Throughput Grids. Antony Chazapis, Antonis Zissimos, and Nectarios Koziris. |
Design and Implementation of a Parallel Performance Data Management Framework. Kevin Huck, Allen D. Malony, Robert Bell, and Alan Morris. |
|
12:00-13:30 |
Lunch break |
||
|
13:30-15:00 |
Session 7A: Francisco J. Quiles Network Hardware |
Session 7B: Nectarios Koziris Peer-to-Peer Technology |
Session 7C: Lionel Ni Algorithms & Applications |
|
|
PFED: A Prediction-Based Fair Active Queue Management Algorithm. W. Gao, J. Wang, J. Chen, and S. Chen. |
PeerWindow: An Efficient, Heterogeneous, and Autonomic Node Collection Protocol. J. Hu, M. Li, H. Yu, H. Dong, and W. Zheng. |
Filter Decomposition for Supporting Coarse-Grained Pipelined Parallelism. Wei Du and Gagan Agrawal. |
|
|
Toward Effective NIC Caching: A Hierarchical Data Cache Architecture for iSCSI Storage Servers. Xiaoyu Yao, Peng Gu and Jun Wang. |
Locality-Aware Randomized Load Balancing Algorithms for DHT Networks. H. Shen and C.-Z. Xu. |
On the Architectural Requirements for Efficient Execution of Graph Algorithms. David A. Bader, Guojing Cong, and John Feo. |
|
|
A New Fault Information Model for Fault-Tolerant Adaptive and Minimal Routing in 3-D Meshes. Zhen Jiang, Jie Wu, and Dajin Wang. |
Caching Routing Indices in Structured P2P Overlays. Hailong Cai and Jun Wang. |
Scalability of Heterogeneous Computing. Xian-He Sun, Yong Chen, and Ming Wu. |
|
15:00-15:30 |
Coffee break |
||
|
15:30-17:00 |
Session 8A: Rafael Casado Interconnection Networks |
Session 8B: Daniel S. Katz Cross-Node Clustering |
Session 8C: David A. Bader Scheduling |
|
|
VLAN-based Minimal Paths in PC Cluster with Ethernet on Mesh and Torus. Tomohiro Otsuka, Michihiro Koibuchi, Akiya Jouraku, and Hideharu Amano. |
Impact of Exploiting Load Imbalance on Coscheduling in Workstation Clusters. Jung-Lok Yu, Driss Azougagh, Jin-Soo Kim, and Seung-Ryoul Maeng. |
An ACO-Based Approach for Scheduling Task Graphs with Communication Costs. Markus Bank, Udo Hoenig, and Wolfram Schiffmann. |
|
|
Fault-Tolerant Routing in Meshes/Tori Using Planarly Constructed Fault Blocks. Dong Xiang, Jia-guang Sun, Jie Wu, and Krishnaiyan Thulasiraman. |
Push-Pull: A Guided Search DAG Scheduling Algorithm for Heterogeneous Cluster Systems. Sang Cheol Kim and Sunggu Lee. |
A Task Duplication Based Scheduling Algorithm Using Partial Schedules. Doruk Bozdag, Fusun Ozguner, Eylem Ekici, and Umit Catalyurek. |
|
|
Peak Power Control for a QoS Capable On-Chip Network. Yuho Jin, Eun Jung Kim, and Ki Hwan Yum. |
Incremental Parallelization Using Navigational Programming: A Case Study. Lei Pan, Wenhui Zhang, Arthur Asuncion, Ming King Lai, Michael B. Dillencourt, and Lubomir F. Bic. |
Scheduling Data Flow Applications using Linear Programming. Luiz Thomaz do Nascimento, Renato A Ferreira, Wagner Meira Jr, and Dorgival Guedes. |
Wednesday, June 15
Antonio González, Director, Intel-UPC Barcelona Research Center
Advances in semiconductor process technology keep driving Moore’s Law and provide a doubling of the transistor density about every two years. These advances have fueled the evolution in processor microarchitecture. In the past we have seen the transition from CISC to RISC, and later to superscalar organizations. More recently we are witnessing an increased emphasis on exploiting thread-level parallelism through so-called multi-core chips. Multi-core chips will become common in all market segments, from high-end servers to desktop and mobile PCs.
This talk will discuss some of the key challenges and opportunities presented by multi-core processors. In particular, aspects related to scalability, adaptability, programmability and reliability will be discussed.
Thursday, June 16
Glenn Deen, Computer Science, IBM Almaden Research Center, San Jose, California
"This talk will discuss the motivation, challenges, and results of the OptimalGrid project's effort at IBM Research to enable massively multiplayer games on grids. The popular open source game engine Quake II from Id Software was transformed from a single server supporting a few players into a massively multiplayer on-line game engine utilizing the resources of a grid to provide a single game world which dynamically scales to meet the needs of players. To achieve this, the OptimalGrid autonomic grid middleware provided the underlying grid and features such as game world partitioning and dynamic load balancing. This enabled the massively multiplayer version to meet game performance needs by adding and removing servers, and by moving portions of the game world to underutilized servers during live game play, while remaining invisible to players. The OptimalGrid approach of providing an autonomic middleware hiding the underlying complexities of the grid made this enhancement of Quake II into a massively multiplayer game possible, while requiring no changes to the Quake II core design or to its fundamental workings."
Friday, June 17
H. Peter Hofstee, IBM Systems and Technology Group
Cell Processor: Motivation, Architecture, Design, Programming and Applications
This talk will present the Cell processor, jointly developed by the STI (Sony-Toshiba-IBM) partnership. Cell is a non-homogeneous chip multiprocessor intended for general-purpose applications but with a particular emphasis on multimedia performance. The Cell processor combines a 64-bit Power Architecture(TM) core with 8 Synergistic Processors. In many cases Cell delivers more than an order of magnitude more performance than conventional PC processors. Cell achieves this performance and power efficiency improvement through a new division of labor between the Power core and the Synergistic Processors.
Cell allows for a wide variety of programming models, a selection of which will be presented in this talk. We will end the talk by discussing some applications that seem to fit the Cell processor particularly well, and by indicating areas of further exploration.