Kemal Efe

Middle East Technical University, Computer Engineering, Faculty Member

I am a retired professor doing research for pleasure.

Papers

Parallel algorithms for optimal data allocation in a dynamic CIM network

Proceedings of the Second International Conference on Systems Integration

An efficient, massively parallel optimization technique is developed for solving the dynamic data allocation problem in medium to large scale applications such as computer integrated manufacturing (CIM) systems. The method is based on a significantly reduced feasible state search space. A statistical evaluation framework compares the performance of the proposed technique with other dynamic data allocation strategies. The algorithms are implemented for a variety of I/O task activation scenarios, with the number of task activation nodes ranging from 50 to 250. The proposed method shows a significant overall improvement over other optimization strategies, especially as the number of task activation nodes increases.

Task scheduling with and without communication delays: A unified approach

European Journal of Operational Research, 1996

The problem of scheduling directed acyclic task graphs on an unbounded number of processors is considered. We present a single algorithm which is applicable to several special cases, thus effecting a unified approach to task scheduling independent of the task graph. We start by considering multi-stage dags and present an algorithm that computes a schedule in O(Nq log q) time, ...

Parallel algorithms for workstation clusters

We investigate the potential of workstation clusters for use in high performance computation for some selected applications. Currently, the network speed found in most of the existing systems is quite low, but higher speed networks are already emerging in the market. We present four parallel algorithms that performed astonishingly well on a cluster of workstations connected by Ethernet. Three of these are algorithms for sorting, matrix multiplication, and ...

Mesh-Connected Trees: A Bridge Between Grids and Meshes of Trees

The grid and the mesh of trees (or MOT) are among the best-known parallel architectures in the literature. Both of them enjoy efficient VLSI layouts, simplicity of topology, and a large number of parallel algorithms that can efficiently execute on them. One drawback of these architectures is that algorithms that perform best on one of them do not perform very well on the other. Thus there is a gap between the algorithmic capabilities of these two architectures. We propose a new class of parallel architectures, called the mesh-connected trees (or MCT), that can execute grid algorithms as efficiently as the grid, and MOT algorithms as efficiently as the MOT, up to a constant amount of slowdown. In particular, the MCT topology contains the MOT as a subgraph and emulates the grid via embedding with dilation 3 and congestion 2. This significant amount of computational versatility offered by the MCT comes at no additional VLSI area cost over these earlier networks. Many topological,...

A Unified Approach to Algorithm Development for Product Networks

There is a major problem with algorithm portability when the user switches from one parallel architecture to another. Since algorithms are usually architecture-dependent, the algorithm running on the old architecture may not run on the new one. Standard techniques, like parallelizing compilers or emulation, have efficacies far below those of algorithms specifically developed for the individual architecture. This paper proposes a two-level approach to programming parallel computers that is applicable as long as the underlying interconnection architecture can be modeled as a product network (e.g. grid, torus, hypercube, etc.). Our approach assumes that there are some low-level routines optimized for the "factor" networks comprising the product network. The set of low-level routines can be implemented as library routines. The high-level programming is then achieved, oblivious to the topology of the factor networks, by decomposing computations in a manner that only uses the se...

The Shape of the Web and Its Implications for Searching the Web

With the rapid growth of the number of web pages, designing a search engine that can retrieve high-quality information in response to a user query is a challenging task. Automated search engines that rely on keyword matching usually return too many low-quality matches and take a long time to run. It is argued in the literature that link-following search methods can substantially increase the search quality, provided that these methods use an accurate assumption about useful patterns in the hyperlink topology of the web. Recent work in the field has focused on detecting identifiable patterns in the web graph and exploiting this information to improve the performance of search algorithms. We survey relevant work in this area and comment on the implications of these patterns for other areas such as advertisement and marketing.

Generalized Algorithm for Parallel Sorting on Product Networks

If G is a connected graph with N nodes, its r-dimensional product contains N^r nodes. We present an algorithm which sorts N^r keys stored in the r-dimensional product of any graph G in O(r^2 S(N)) time, where S(N) depends on G. We show that for any graph G, S(N) is bounded above by O(N), establishing an upper bound of O(r^2 N) for the time complexity of sorting N^r keys on any product network. When r is fixed, this leads to the asymptotic complexity O(N) to sort N^r keys, which is optimal for several instances of product networks. There are graphs for which S(N) = O(log^2 N), which leads to the asymptotic running time of O(log^2 N). Keywords: sorting, interconnection networks, product networks, algorithms, odd-even merge. 1 Introduction. In [1], Batcher presented two efficient sorting networks. Algorithms derived from these networks have been presented for a number of different parallel architectures, like the shuffle-exchange network [10], the grid [11, 5], ...
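As a rough illustration of the node count stated in this abstract, the following Python sketch builds the r-dimensional product of a small factor graph. It assumes the usual Cartesian-product definition (two r-tuples are adjacent when they differ in exactly one coordinate and the differing entries are adjacent in G). The function name `graph_product` and the example graph `path3` are hypothetical, and this is not the paper's sorting algorithm.

```python
from itertools import product


def graph_product(adj, r):
    """Build the r-dimensional product of a graph (assumed Cartesian product).

    adj maps each node of the factor graph G to the set of its neighbours.
    Nodes of the product are r-tuples of nodes of G.
    """
    nodes = list(adj)
    result = {}
    for label in product(nodes, repeat=r):
        neighbours = set()
        # Change one coordinate at a time along an edge of G.
        for i, u in enumerate(label):
            for v in adj[u]:
                neighbours.add(label[:i] + (v,) + label[i + 1:])
        result[label] = neighbours
    return result


# Hypothetical factor graph G: a path on N = 3 nodes.
path3 = {0: {1}, 1: {0, 2}, 2: {1}}

grid = graph_product(path3, 2)   # a 3 x 3 grid
cube = graph_product(path3, 3)   # a 3 x 3 x 3 grid
print(len(grid), len(cube))      # node counts N^2 and N^3
```

With a path as the factor graph the product is a grid, which is why the grid appears in the abstracts above as an example of a product network.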

Products of Networks With Logarithmic Diameter and Fixed Degree

This paper first presents some general properties of product networks pertinent to parallel architectures and then focuses on three case studies. These are products of complete binary trees, shuffle-exchange, and de Bruijn networks. It is shown that all of these are powerful architectures for parallel computation, as evidenced by their ability to efficiently emulate numerous other architectures. In particular, r-dimensional grids and r-dimensional meshes of trees can be embedded efficiently in products of these graphs, i.e. either as a subgraph or with small constant dilation and congestion. In addition, the shuffle-exchange network can be embedded in the r-dimensional product of shuffle-exchange networks with dilation cost 2r and congestion cost 2. Similarly, the de Bruijn network can be embedded in the r-dimensional product of de Bruijn networks with dilation cost r and congestion cost 4. Moreover, it is well known that shuffle-exchange and de Bruijn graphs can emulate the hypercu...

A Proof for Bisection Width of Grids

World Academy of Science, Engineering and Technology 27, 2007

The optimal bisection width of an r-dimensional N × ··· × N grid is known to be N^(r−1) when N ...

Robust K-Mer Partitioning for Parallel Counting

Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies, 2018

The Kiwinet-Nicola Approach: Response Generation in a User-Friendly Interface

Computer, 2000

Roberto Bayardo, Google, USA

computer.org, 2002

Gagan Agarwal, Ohio State University, USA; Aijun An, York University, Canada; Peter Andreae, Victoria University of Wellington, New Zealand; Luiza Antonie, University of Alberta, Canada; Chris Bailey-Kellogg, Dartmouth College, USA; Arindam Banerjee, University of Minnesota, Twin Cities, USA; Rohan Baxter, ATO, Australia; Roberto Bayardo, Google, USA; Chiranjib Bhattacharya, Indian Institute of Science, Bangalore, India; Indrajit Bhattacharya, IBM Research, Delhi, India; Sourav Bhowmick, Nanyang Technological University, Singapore ...

Research: Effective queueing strategies for co-scheduling in a pool of processors

Computer Communications, Aug 1, 1996

The configuration of stack filters by probabilistic search

IEEE International Symposium on Circuits and Systems, 1990

An efficient search method for the configuration of stack filters is presented. Nonlinea...

Load balancing with network cooperation

[1991] Proceedings. 11th International Conference on Distributed Computing Systems, 1991

A detailed analytical and simulation model that accurately captures the effect of communication delay for local area networks is presented. To demonstrate the framework, load sharing algorithms are presented and evaluated both with and without the effect of the communication network delay. The algorithms use the Ethernet communication protocol to their advantage and provide superior performance compared to several published ...

Response Generation in a User Friendly Interface

Evaluating User Effectiveness in Exploratory Search with TouchGraph Google Interface

Lecture Notes in Computer Science, 2009

Performance of co-scheduling on a network of workstations

[1993] Proceedings. The 13th International Conference on Distributed Computing Systems, 1993

In a set of high performance workstations connected by a network, many workstations may be underutilized by their owners. While each workstation may be primarily responsible for executing its owner's tasks with the highest priority, the unused processing capacity may be made available to computationally intensive tasks submitted externally to the system. Static co-scheduling for such an environment has been ...

A reduced diameter interconnection network

Proceedings., 2nd Symposium on the Frontiers of Massively Parallel Computation, 1989

A network based on the hypercube, called the multiply twisted cube, is proposed. This network preserves many of the desirable properties of the hypercube, but has a diameter which is only ⌈(n+1)/2⌉ for an n-dimensional multiply twisted cube, a reduction of nearly 50% compared to the ordinary hypercube. Some of the basic topological properties of multiply twisted cubes are discussed, ...

Computational Properties of Mesh Connected Trees: Versatile Architectures for Parallel Computation

1994 International Conference on Parallel Processing-Vol 1 (ICPP'94), 1994

Recently, the mesh connected trees (MCT) network has been proposed as a possible architecture for parallel computers. MCT networks are obtained by combining complete binary trees using the cross product operation. This paper focuses on structural, embedding, routing, and layout properties of the MCT networks. We show that MCT networks are computationally more powerful than grids and complete binary trees, and at least as powerful as meshes of trees (MOT). Analysis of VLSI complexity shows that the ...