
    Tor Skeie

    Clouds offer flexible and economically attractive compute and storage solutions for enterprises. However, the effectiveness of cloud computing for high-performance computing (HPC) systems remains questionable. When clouds are deployed on lossless interconnection networks, like InfiniBand (IB), challenges related to load balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. Moreover, cloud data centers are highly dynamic environments, rendering the static network reconfigurations typically used in IB systems infeasible. In this paper, we present a framework for a self-adaptive network architecture for HPC clouds based on lossless interconnection networks, demonstrated by means of our implemented IB prototype. Our solution, based on a feedback control and optimization loop, enables the lossless HPC network to dynamically adapt to varying traffic patterns, current resource availability, and workload distributions, in accordance with service provider-defined policies. Furthermore, we present IBAdapt, a simplified rule-based language for service providers to specify the adaptation strategies used by the framework. Our self-adaptive IB network prototype is demonstrated using state-of-the-art industry software. The results obtained on a test cluster demonstrate the feasibility and effectiveness of the framework in improving Quality-of-Service compliance in HPC clouds.
    Exascale computing systems are being built with thousands of nodes. A key component of these systems is the interconnection network. The high number of components significantly increases the probability of failure. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A recently proposed topology for these large systems is the hybrid KNS family, which provides excellent performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a reasonably large number of faults without disabling any healthy node. In order to tolerate network failures, the methodology uses a simple mechanism: for some source-destination pairs, and only if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network) in order to avoid the faults. The evaluation results show that the methodology tolerates a large number of faults. Furthermore, the methodology offers graceful performance degradation; for instance, performance degrades only 1% for a 2D network with 1024 nodes and 1% faulty links.
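    The intermediate-node mechanism described above can be illustrated with a minimal sketch. The following toy example uses a 2D mesh with dimension-order routing rather than the paper's KNS topology, and all function names are our own illustration, not the authors' implementation: if the minimal path crosses a faulty link, the packet is instead sent through an intermediate node whose two minimal sub-paths both avoid the faults.

    ```python
    from itertools import product

    def dor_path(src, dst):
        """Dimension-order (X then Y) minimal path on a 2-D mesh."""
        path, (x, y) = [src], src
        while x != dst[0]:
            x += 1 if dst[0] > x else -1
            path.append((x, y))
        while y != dst[1]:
            y += 1 if dst[1] > y else -1
            path.append((x, y))
        return path

    def crosses_fault(path, faulty_links):
        """True if any hop of the path traverses a faulty link."""
        return any(frozenset((a, b)) in faulty_links
                   for a, b in zip(path, path[1:]))

    def route(src, dst, dims, faulty_links):
        """Use the direct minimal route when it is fault-free; otherwise
        detour through an intermediate node I whose sub-paths src->I and
        I->dst both avoid the faults (packet is never ejected en route)."""
        direct = dor_path(src, dst)
        if not crosses_fault(direct, faulty_links):
            return direct
        for mid in product(range(dims[0]), range(dims[1])):
            if mid in (src, dst):
                continue
            a, b = dor_path(src, mid), dor_path(mid, dst)
            if not crosses_fault(a, faulty_links) and \
               not crosses_fault(b, faulty_links):
                return a + b[1:]
        return None  # more faults than this simple scheme tolerates
    ```

    Only source-destination pairs whose minimal path is broken pay the detour cost; all other pairs keep their original routes.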
    As the size of high-performance computing systems grows, the number of events requiring a network reconfiguration, as well as the complexity of each reconfiguration, is likely to increase. In large systems, the probability of component failure is high. At the same time, with more network components, ensuring high utilization of network resources becomes challenging. Reconfiguration in interconnection networks, like InfiniBand (IB), typically involves computation and distribution of a new set of routes in order to maintain connectivity and performance. In general, current routing algorithms do not consider the existing routes in a network when calculating new ones. Such configuration-oblivious routing might result in substantial modifications to the existing paths, and the reconfiguration becomes costly as it potentially involves a large number of source-destination pairs. In this paper, we propose a novel routing algorithm for IB-based fat-tree topologies, SlimUpdate. SlimUpdate employs techniques to preserve existing forwarding entries in switches to ensure a minimal routing update, without any performance penalty, and with minimal computational overhead. We present an implementation of SlimUpdate in OpenSM, and compare it with the current de facto fat-tree routing algorithm. Our experiments and simulations show a decrease of up to 80% in the number of total path modifications when using SlimUpdate routing, while achieving similar or even better performance than the fat-tree routing in most reconfiguration scenarios.
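    The core idea of preserving existing forwarding entries can be shown in a few lines. This is a hypothetical sketch of the general technique, not the SlimUpdate or OpenSM implementation: for each destination, the old output port is kept whenever it is still usable, so only entries invalidated by the topology change are modified.

    ```python
    def reroute_preserving(old_table, valid_ports):
        """Recompute a switch forwarding table with minimal changes.
        `old_table`: {dst: port} before reconfiguration.
        `valid_ports`: {dst: set of ports still usable towards dst}.
        Keeps the existing port when still valid; otherwise picks the
        least-loaded valid port (load = entries assigned in this pass)."""
        new_table, load = {}, {}
        for dst, ports in sorted(valid_ports.items()):
            keep = old_table.get(dst)
            if keep in ports:
                port = keep
            else:
                port = min(ports, key=lambda p: (load.get(p, 0), p))
            new_table[dst] = port
            load[port] = load.get(port, 0) + 1
        return new_table

    def modified_entries(old_table, new_table):
        """Number of forwarding entries that must be updated in hardware."""
        return sum(1 for d in new_table if old_table.get(d) != new_table[d])
    ```

    A configuration-oblivious algorithm could rewrite every entry; preserving valid entries keeps the update cost proportional to the actual damage.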
    An increasing number of interconnect technologies rely on source routing to forward packets through the network. It is therefore important to develop methods for fault tolerance that are well suited for source-routed networks. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted for reconfiguration. Source routing readily supports the source node choosing a different path when a fault occurs, but with this approach, packets already in the network will be lost. Local dynamic fault tolerance, where the packet is routed around the fault locally, prevents much of the traffic from being lost during failures, but it is cumbersome to achieve in source-routed networks, since packets encountering a fault need to follow a path different from that encoded in the packet header. In this paper we present a mechanism to achieve local dynamic fault tolerance in source-routed fat trees, a topology in widespread use in supercomputer systems, and compare it with endpoint dynamic fault tolerance. We also show that by combining the two approaches we achieve performance superior to either of the two individually.
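    The local-detour idea can be sketched abstractly: a packet carries its route in the header, and the switch adjacent to a failed link splices a locally known detour into the remaining route instead of dropping the packet. This is our simplified illustration, not the paper's fat-tree mechanism; the data structures are hypothetical.

    ```python
    def deliver(route, faulty_links, local_detours):
        """Walk a source route hop by hop. When the next link has failed,
        splice in a locally known detour around it (local dynamic fault
        tolerance) rather than losing the packet and retrying from the source.
        `route`: list of node ids; `local_detours[(u, v)]`: replacement hops
        from u to v (excluding u, ending at v)."""
        path, queue = [route[0]], list(route[1:])
        while queue:
            u, v = path[-1], queue[0]
            if frozenset((u, v)) in faulty_links:
                detour = local_detours.get((u, v))
                if detour is None:
                    return None      # no local alternative: packet is lost
                queue[:1] = detour   # rewrite the remainder of the route
                continue
            path.append(queue.pop(0))
        return path
    ```

    Endpoint fault tolerance would instead return None for every in-flight packet and let the source resend on a new path; combining both covers in-flight traffic and future traffic.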
    Virtualization of computing resources is becoming increasingly important both for high-end servers and multi-core CPUs. In a virtualized system, the set of resources that constitute a virtual compute entity should be spatially separated from each other. Dividing the cores on a chip, or the CPUs in a high-end server, into disjoint sets for each task is a trivial problem.
    Interconnection networks play a key role in the fault tolerance of massively parallel computers, since faults may isolate a large fraction of the machine containing many healthy nodes. In this paper, we present a methodology to design fully adaptive fault-tolerant routing algorithms for direct interconnection networks that can be applied to different regular topologies. The methodology is mainly based on the selection of an intermediate node (if needed) for each source-destination pair. Packets are adaptively routed to the intermediate node and, from this node, they are adaptively forwarded to their destination. This methodology requires only one additional virtual channel, even for tori. Evaluation results show that the methodology is 7-fault tolerant, and for up to 14 faults, more than 99% of the combinations are tolerated, without significantly degrading performance in the presence of faults.
    ... Frank Olaf Sem-Jacobsen, Åshild Grønstad Solheim, Olav Lysne, Tor Skeie, and Thomas Sødring, Department of Informatics / Networks and Distributed Systems ... The second dataset is the continuous black line, which we call the network performance ratio. ...
    Virtualization is the key to efficient resource utilization and elastic resource allocation in cloud computing. It enables consolidation, the on-demand provisioning of resources, and elasticity through live migration. Live migration makes it possible to optimize resource usage by moving virtual machines (VMs) between physical servers in an application-transparent manner. It does, however, require a flexible, high-performance, scalable virtualized I/O architecture to reach its full potential. This is challenging to achieve with high-speed networks such as InfiniBand and remote direct memory access enhanced Ethernet, because these devices usually maintain their connection state in the network device hardware. Fortunately, the single root I/O virtualization (SR-IOV) specification addresses the performance and scalability issues. With SR-IOV, each VM has direct access to a hardware-assisted virtual device without the overhead introduced by emulation or para-virtualization. However, SR-IOV does not address the migration of the network device state. In this paper we present and evaluate the first available prototype implementation of live migration over SR-IOV enabled InfiniBand devices.
    Computer architectures for high performance computing have traditionally been based on an assumption of one parallel application running alone on one machine. The current trend is, however, that huge computer installations offer compute power to a set of users or customers, each demanding only a subset of the available compute resources. This places new requirements on the architecture, in that it must support dynamic partitioning of the resources into several virtual servers as demand changes. We introduce a novel framework which supports flexible formation of such virtual servers while preventing interference between the communication of different virtual servers. This paper investigates the impacts of a shared interconnection network on applications running on virtual compute servers. We show that the interconnect performance supplied to each job is highly unpredictable, and that a job can experience a performance degradation of 97% when its traffic interferes with the traffic of concurrent jobs. With a minor reduction in the utilization of each processing node, this can be considerably improved through a combination of routing-containment in the interconnection network and a carefully designed resource allocation strategy.
    Massively parallel computing systems are being built with thousands of nodes. Because of the high number of components, it is critical to keep these systems running even in the presence of failures. Interconnection networks play a key-role in these systems, and this paper proposes a fault-tolerant routing methodology for use in such networks. The methodology supports any minimal routing function
    A modern supercomputer or large-scale server consists of a huge set of components that perform processing functions and various forms of input/output and memory functions. All of the components unite in a complex collaboration to perform the tasks of the entire system. The communication between these components that allows this collaboration to take place is supported by an infrastructure called the interconnection network.
    Understanding the nature of traffic in high-speed communication systems is essential for achieving QoS in these networks. A first step towards this goal is understanding how basic QoS mechanisms work and affect network predictability, before we introduce more complex mechanisms such as admission control. In this paper we analyse the effect of a DiffServ-inspired QoS concept applied to virtual cut-through networks. The main findings from our study are that (i) throughput differentiation can be achieved by weighting of virtual lanes (VL) and by classifying VLs as either low or high priority, (ii) the balance between VL weighting and VL load is not crucial when the network is operating below the saturation point, and (iii) jitter, however, is large, and good jitter characteristics seem unachievable with such a relative scheme.
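    The two differentiation mechanisms named in finding (i) can be sketched together. This toy arbiter (our illustration; VL numbering, weights, and packet names are assumptions, not taken from the paper) gives high-priority VLs strict precedence over each transmission slot and shares the remaining slots among low-priority VLs in proportion to their weights.

    ```python
    from collections import deque

    def vl_schedule(queues, weights, high_priority, slots):
        """One packet is sent per slot. High-priority VLs always win the slot
        (strict priority); otherwise low-priority VLs share it in weighted
        round-robin order. `queues`: {vl: deque of packets};
        `weights`: {low-priority vl: weight}; `high_priority`: set of VLs."""
        wrr = [vl for vl, w in sorted(weights.items())
               for _ in range(w) if vl not in high_priority]
        out, i = [], 0
        for _ in range(slots):
            hp = next((vl for vl in sorted(high_priority) if queues[vl]), None)
            if hp is not None:
                out.append((hp, queues[hp].popleft()))
                continue
            for _ in range(len(wrr)):      # skip over empty low-priority VLs
                vl = wrr[i % len(wrr)]
                i += 1
                if queues[vl]:
                    out.append((vl, queues[vl].popleft()))
                    break
        return out
    ```

    With weights 2:1, the low-priority VLs receive bandwidth in a 2:1 ratio once the high-priority VLs are drained, which is the throughput differentiation the study measures; the scheme says nothing about jitter, matching finding (iii).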
    End-to-end congestion control is the main method of congestion control in the Internet, and achieving consistently low queuing latency with end-to-end methods is a very active area of research. Even so, achieving consistently low queuing latency in the Internet remains an unsolved problem. Therefore, we ask: "What are the fundamental limits of end-to-end congestion control?" We find that the unavoidable queuing latency for best-case end-to-end congestion control is on the order of hundreds of milliseconds under conditions that are common in the Internet. Our argument depends on two things: the latency of congestion signaling, bounded at minimum by the speed of light, and the fact that link capacity may change rapidly for an end-to-end path in the Internet.
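    The shape of this argument can be made concrete with a back-of-the-envelope calculation (our illustration of the reasoning, not the paper's model): a sender keeps transmitting at a rate matched to the old bottleneck capacity until the congestion signal arrives, and the queue built up in the meantime must then drain at the new, lower capacity.

    ```python
    def queuing_latency_after_capacity_drop(rate_bps, new_cap_bps, signal_delay_s):
        """Queuing latency that builds up while the sender is still unaware
        of a capacity drop. The sender sends at `rate_bps` for `signal_delay_s`
        seconds after the bottleneck falls to `new_cap_bps`; the excess bits
        queue at the bottleneck and drain at the new capacity."""
        excess_bits = (rate_bps - new_cap_bps) * signal_delay_s  # queue growth
        return excess_bits / new_cap_bps  # seconds of queuing latency

    # Example: capacity halves (100 -> 50 Mbit/s) and the congestion signal
    # takes one 100 ms round trip: the queue represents 100 ms of latency.
    latency = queuing_latency_after_capacity_drop(100e6, 50e6, 0.100)
    ```

    When capacity halves, the queuing latency equals the signaling delay itself, so no end-to-end scheme can react faster than its control loop: with Internet-scale propagation delays this is tens to hundreds of milliseconds.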
    In September 2020, the Broadband Forum published a new industry standard for measuring network quality. The standard centers on the notion of quality attenuation. Quality attenuation is a measure of the distribution of latency and packet loss between two points connected by a network path. A vital feature of the quality attenuation idea is that we can express detailed application requirements and network performance measurements in the same mathematical framework. Performance requirements and measurements are both modeled as latency distributions. To the best of our knowledge, existing models of the 802.11 WiFi protocol do not permit the calculation of complete latency distributions without assuming steady-state operation. We present a novel model of the WiFi protocol. Instead of computing throughput numbers from a steady-state analysis of a Markov chain, we explicitly model latency and packet loss. Explicitly modeling latency and loss allows for both transient and steady-state analysis.
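    The "same mathematical framework" point can be sketched directly: represent each hop's latency as a discrete distribution, compose hops by convolution (the latency of independent hops adds), and state a requirement as a percentile over the same kind of distribution. The numbers below are toy values of our own, not measurements from the standard.

    ```python
    def convolve(d1, d2):
        """Compose two per-hop latency distributions ({latency_ms: prob}).
        End-to-end latency over independent hops is the sum of the per-hop
        latencies, so the combined distribution is the convolution."""
        out = {}
        for l1, p1 in d1.items():
            for l2, p2 in d2.items():
                out[l1 + l2] = out.get(l1 + l2, 0.0) + p1 * p2
        return out

    def meets_requirement(dist, latency_ms, min_fraction):
        """A requirement expressed in the same framework: at least
        `min_fraction` of packets must arrive within `latency_ms`."""
        return sum(p for l, p in dist.items() if l <= latency_ms) >= min_fraction
    ```

    Because requirements and measurements share one representation, checking compliance reduces to comparing two distributions, which is what makes the quality attenuation framing useful.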
    Nowadays, the use of multimedia applications that present QoS requirements is increasing rapidly. Advanced Switching (AS) is a new interconnection network technology that expands the capabilities of PCI Express. AS provides mechanisms that can be used to support QoS. Specifically, an AS fabric permits us to employ virtual channels, egress link scheduling, and an admission control mechanism to differentiate between traffic flows. In this paper we examine these mechanisms and show how to provide QoS based on bandwidth and latency requirements. Furthermore, a new algorithm, Weighted Fair Queuing Credit Aware, is proposed as a specific implementation of one of the schedulers suggested by the AS specification.
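    A simplified sketch can show what "credit aware" adds to weighted fair queuing: a flow's head packet is only eligible for transmission if enough link-level credits remain for its size. This is our own illustration under that assumption; the paper's Weighted Fair Queuing Credit Aware algorithm may differ in detail, and the per-flow finish-time bookkeeping below is a simplification of full WFQ virtual time.

    ```python
    def wfq_credit_aware(flows, credits, packets):
        """Weighted fair queuing with a credit check.
        `flows`: {flow: weight}; `packets`: {flow: list of packet sizes};
        `credits`: link credits available (same units as packet sizes).
        Among flows whose head packet fits in the remaining credits, the
        one with the smallest virtual finish time is served next."""
        finish = {f: 0.0 for f in flows}          # simplified finish times
        heads = {f: list(ps) for f, ps in packets.items()}
        order = []
        while True:
            eligible = [f for f in flows if heads[f] and heads[f][0] <= credits]
            if not eligible:
                break                             # out of credits or packets
            f = min(eligible, key=lambda f: finish[f] + heads[f][0] / flows[f])
            size = heads[f].pop(0)
            finish[f] += size / flows[f]
            credits -= size
            order.append((f, size))
        return order
    ```

    With scarce credits the scheduler stops rather than overcommitting the link; with sufficient credits it degenerates to plain WFQ, serving flows in proportion to their weights.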
    In large high-performance computing systems, the probability of component failure is high. At the same time, for a sustained system performance, reconfiguration is often needed to ensure high utilization of available resources. Reconfiguration in interconnection networks, like InfiniBand (IB), typically involves computation and distribution of a new set of routes in order to maintain connectivity and performance. In general, current routing algorithms do not consider the existing routes in a network when calculating new ones. Such configuration-oblivious routing might result in substantial modifications to the existing paths, and the reconfiguration becomes costly as it potentially involves a large number of source–destination pairs. In this paper, we propose a novel routing algorithm for IB-based fat-tree topologies, SlimUpdate. SlimUpdate employs path preservation techniques to achieve a decrease of up to 80% in the number of total path modifications, as compared to the OpenSM’s fat-tree routing algorithm, in most reconfiguration scenarios. Furthermore, we present a metabase-aided re-routing method for fat-trees, based on destination leaf-switch multipathing. Our proposed method significantly reduces network reconfiguration overhead, while providing greater routing flexibility. On successive runs, our proposed method saves up to 85% of the total routing time over the traditional re-routing scheme. Based on the metabase-aided routing, we also present a modified SlimUpdate routing algorithm to dynamically optimize routes for a given MPI node order.
    Rerouting around faulty components and migration of jobs both require reconfiguration of data structures in the Queue Pairs residing in the hosts on an InfiniBand cluster. In this paper we report an implementation of dynamic reconfiguration of such host side data-structures. Our implementation preserves the Queue Pairs, and lets the application run without being interrupted. With this implementation, we demonstrate
    The paper first gives an overview of the functions required for providing Internet connectivity and mobility management for mobile ad-hoc networks (MANETs). Internet gateway selection is one of these functions. Since multiple Internet gateways might exist in the same MANET domain, a hybrid metric for Internet gateway selection is proposed as a replacement for the shortest hop-count metric. The hybrid metric provides load balancing of intra/inter-MANET traffic. Simulation results show that ad-hoc routing protocols using our proposed metric achieve better performance in terms of packet delivery ratio and transmission delay, at the cost of slightly increased signalling overhead.
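    The general shape of a hybrid gateway-selection metric can be sketched as a normalised blend of path length and gateway load. The weighting, normalisation constants, and function names below are our own assumptions for illustration, not the metric defined in the paper.

    ```python
    def hybrid_metric(hops, load, alpha=0.5, max_hops=10, max_load=100):
        """Blend of hop count and gateway load, each normalised to [0, 1].
        `alpha` trades path length against load; lower metric is better.
        All constants here are illustrative assumptions."""
        return alpha * (hops / max_hops) + (1 - alpha) * (load / max_load)

    def select_gateway(gateways):
        """Pick the gateway with the lowest hybrid metric.
        `gateways`: {name: (hop_count, current_load)}."""
        return min(gateways, key=lambda g: hybrid_metric(*gateways[g]))
    ```

    A pure shortest hop-count rule would send all traffic to the nearest gateway even when it is overloaded; the blended metric shifts flows toward a slightly more distant but lightly loaded gateway, which is the load-balancing effect the abstract describes.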

    And 137 more