Optimal resource assignment through negotiation in a multi-agent manufacturing system

2000, IIE Transactions

Research studies on multi-agent systems have recently been boosted by manufacturing and logistics, with deep motivations such as the presence of independent human deciders with individual goals, the aspiration to dominate the complexity of decision-making in large organizations, and the simplicity and robustness of self-reacting distributed systems. After a survey of the multi-agent paradigm and its applications, the paper introduces the notion of hybrid holonic system to study the effect of supervision on a system whose elements negotiate and cooperate in a rule-settled environment to obtain resources for system operation. The supervisor can spur or discourage agents by assigning or denying resources to them. A simple single-decider optimization model drawn from a real application is described, and solution methodologies for optimal resource allocation fitting different scenarios (centralized, distributed, multi-agent) are discussed, identifying ranges of autonomy, quantifying rewards and defining a negotiation protocol between the agents and the supervisor. The aim of the paper is to describe, through an example, a general methodology for quantitative decision-making in multi-agent organizations.

IIE Transactions (2000) 32, 963–974

CLAUDIO ARBIB and FABRIZIO ROSSI
Dipartimento di Matematica Pura e Applicata, Università di L'Aquila, Via Vetoio, Coppito, I-67010 L'Aquila, Italy
E-mail: arbib@univaq.it or rossi@univaq.it

Received April 1999 and accepted December 1999

0740-817X © 2000 "IIE"

1. Introduction

A holonic (or multi-agent) system is a finite set of bounded-rational, individually operating elements called agents, able to coordinate their actions through cooperation and competition in an environment settled by rules. Agents are generally characterized as entities provided with sensors, limited memory, computational capabilities and effectors. In practice, their behavior consists of autonomous actions such as collecting data from the environment, elaborating information, evaluating scenarios and applying strategies to reach their own objectives.

Holonic systems have been receiving an increasing amount of attention since their first appearance (Koestler, 1967). Their applications range from biology to artificial intelligence, computer science, logistics and, lately, manufacturing. The multi-agent paradigm is easily recognizable in such areas as parallel computing, object-oriented programming and database design. However, mathematical models describing its features have primarily been developed in biology and artificial intelligence, typically using game theory and (fuzzy, multi-value) logic.

Manufacturing and logistics are currently providing the motivation to perform research on multi-agent systems. As a matter of fact, besides, and sometimes despite, Computer Integrated Manufacturing, MRP and the related centralized facilities for quantitative decision-making, many decisions concerning shop floor activities or vehicle routing (to quote a couple of examples) are still independently made by human operators who try to pursue local goals suggested by their own experience in order to cope with partially-planned or unplanned events.

Relying on autonomous agents is often viewed as a way to dominate the complexity of decision-making in large organizations. This practice has enjoyed great success in particular areas, for instance in the North-East Italian textile industry, where, allied to the adoption of ISO-9000, outsourcing has been extended to most manufacturing activities: in most cases, these are in fact committed to outside contractors who are completely autonomous in decision making, so that corporations can concentrate on high-level functions such as R&D, design and trading.
In this scenario, the corporation allots the jobs to a market scattered over a geographic area, providing each contractor with assembly plans and semi-finished parts, and negotiating with it due dates and compensation. The contractor then has complete freedom of choice about production strategies. In particular, depending on such factors as due dates, present workload, resource availability, client priorities etc., it can negotiate the commitment or acceptance of jobs with other contractors ensuring the same quality standard.

Of course, small contractors do not usually have a vision of all the possible strategies; furthermore, they generally deploy little planning/computing ability: as a consequence, their behavior is likely to be found wanting from the viewpoint of optimality. Nevertheless, this behavior seems definitely competitive with that of a centralized system with respect to crucial issues like: (i) robustness; (ii) modularity; and (iii) simplicity. Moreover, in the light of the lessons learnt from JIT practice, agents are better targeted at the production process rather than management functions. Finally, in a well-designed environment, it might be simply myopic to shift the capabilities of agents to activate resources and perform evaluation steps to a decision level which is higher but in fact poorer in terms of knowledge of processes and procedures.

Suitable benefit-to-cost analyses are of course necessary to evaluate to what extent it is convenient to adopt a holonic paradigm in a manufacturing setting. As reported below in some detail, several examples of trade-offs have been pointed out in the literature: the extreme options are either spending on a centralized, full-information system able to optimize decisions and processes to ensure fault recovery, or de-allocating control as much as possible.
The costs entailed by the former option are mainly due to computation, software development and maintenance, complexity handling and rigidity; the major costs of the latter, on the other hand, are basically due to communication congestion and sub-optimality. In between these extremes, however, there is a whole range of intermediate multi-agent solutions maintaining some centralized control under the form, perhaps, of spur and/or rule setting.

In this paper, after a brief literature survey in Section 2, hybrid systems of this sort are formally introduced and, with the help of an application, they are analysed in Sections 3 and 4. In particular, an optimal resource allocation problem is defined and solved with both a centralized approach (Section 4.1) and a distributed approach (Section 4.2). The dual analysis of the problem (Section 4.3) leads to a natural definition of a spur system that operates under a negotiation protocol between the agents and the supervisor. Some computational aspects are discussed in Section 5. Remarks on methodology (Section 6) conclude the paper.

2. A literature survey

2.1. Multi-agent systems in artificial intelligence and biology

A large part of the literature on multi-agent systems deals with simulating and analyzing the behavior of such systems under particular assumptions and solicitations. A contribution to these studies has recently appeared in the form of a special issue of Artificial Intelligence devoted to the economic principles of multi-agent systems (Boutilier et al., 1997).
Several papers within this special issue are concerned with behavior analysis, either from the viewpoint of logical models of agents operating as qualitative decision-makers (Brafman and Tennenholz, 1997; Poole, 1997), or looking at the mechanisms of coalition forming in a bounded-rational context which includes the costs of computing a strategy (Sandholm and Lesser, 1997), or discussing negotiation and cooperation issues (Kraus, 1997), or describing the system as a game with incomplete information (Koller and Pfeffer, 1997).

An important role in these descriptions is played by rule setting. Rules are defined as restrictions on agents' operation, and include both rational laws and social conventions. Shoham and Tennenholz (1997) observe the behavior update under the introduction of conventions, but do not consider the possibility of deciding which rules are more suitable to control the system when a supervisor or controller aims at particular objectives. In fact, supervision is barely considered, apart from the information one can extract from computer simulation. Some contributions in this direction are given by Kraus (1997), who suggests the use of "operations research techniques" to tackle particular coalition planning problems (formulated in terms of set partitioning) and quotes examples from manufacturing and shop-floor control (Balasubramanian and Norrie, 1995), cooperative shipping companies (Fischer et al., 1995), and distributed computing (Malone et al., 1988).

The paradigm of distributed computing is often used to describe holonic systems, see for instance Talkudar et al. (1998). This paradigm actually illustrates well the trade-off between cooperation and autonomy: a processor making information on its state available to other processors can in fact improve computation efficiency at the price of communication overhead.
A similar kind of trade-off arises when the computational costs of strategies are taken into account: see Sandholm and Lesser (1997), who describe an experiment on a vehicle-routing model in which every solution has a computation and an implementation cost, and the smaller the former, the larger the latter (see also Section 2.2).

2.2. Multi-agent models in manufacturing and logistics

In the area of logistics, as just recalled, distributed vehicle routing is considered among others by Sandholm and Lesser (1997), who discuss a setting with self-interested agents dealing with an intractable combinatorial problem that, being practically unsolvable, acts as a limit on the agents' rationality. In the case considered, several dispatch centers receive delivery orders and route a fleet of vehicles over a geographic region. Fleet and region are partitioned into pools and operation areas, respectively: each pool directly responds to a dispatch center, which is modeled as an agent responsible for a set of deliveries; on the other hand, each agent covers a particular operation area, but distinct areas may overlap, thus creating the potential for cooperation and competition. Each agent can use some computational resource to solve the routing problem as best it can: the solution quality (and therefore the implementation cost) depends on how much computational resource the agent allocates to the solution algorithm, which, in the simulation experiments, is assumed to be the same for all the agents; since allocating resources is costly, a trade-off emerges.

Early work on holonic models applied to the area of manufacturing was performed by Malone and Smith (1988) and by Shaw (1989). These two contributions highlight that the key issue in this area is negotiation.
A first step towards modeling this aspect is taken with the definition of mechanisms (in practice, communication protocols) allowing the agents to negotiate: Shaw (1989) in fact identifies a "seller" and a "buyer" in any pair of negotiating agents and defines the rules under which transactions are to be executed. On the other hand, Malone and Smith (1988) concentrate on how to compare the performance of coordination structures appearing in different kinds (human, computers, etc.) of multi-agent organizations. Lin and Solberg (1992) present a framework for integrating shop-floor control using autonomous agents. Ramos and Sousa (1996) analyse dynamic scheduling within the framework of negotiation protocols.

In general, negotiation can be classified according to whether it occurs:
• between homogeneous agents, e.g., servers competing for a client (Sandholm and Lesser, 1997);
• between non-homogeneous agents, e.g., a server and a client assessing the right price for a service (Adacher et al., 1999);
• between the agents and a supervisor, as in the present contribution.

It is in our opinion worth noticing that, when trying to state what is and what is not holonic in manufacturing, many authors agree that most scheduling systems operating under real-time dispatching policies (i.e., the large majority of practical applications) should actually not be regarded as holonic, even though they are generally simpler and more flexible than those in which an optimal operation scheduling is attempted on the basis of the information collected over some planning horizon. The reasons are that: (i) the dispatching of resources and workloads to the machining stations generally cannot be appealed by those responsible for the stations; and (ii) no kind of job negotiation is normally allowed between distinct stations. This viewpoint is explicitly adopted in a recent contribution on autonomous agents by Adacher et al. (1999).
Here a simulation model is devised with the aim of analyzing the effectiveness of different control architectures in a flexible flow shop with features typical of the multi-agent paradigm. The indications coming from the study positively stress large degrees of autonomy, especially in scenarios strongly characterized by negotiation and cooperation.

In conclusion, resource assignment and negotiation appear fundamental features of multi-agent systems. This motivates the focus on the resource-assignment problems described in the remainder of the paper.

3. The case study: system architecture

We investigate the problem of assigning resources to workstations in a flow line devoted to the manufacture of electronic components. The model proposed is based on a high-level description of the system architecture, close to the one described in Leung et al. (1997). In fact, even if the problem arises in a specific production context, both the problem and the model appear sufficiently general to be regarded as representative of different manufacturing environments.

Fig. 1. Schematic layout of a flow line with part feeders, and information system architecture.

The line is made of $q = 50$ workstations $w_i$, $i = 1, \ldots, q$, and is served by a carrier system organized into $p = 30$ intersecting tracks (see Fig. 1). According to group technology, the $k$th carrier moves back and forth along the $k$th track and is therefore able to serve only a subset $\{w_{i_k}, w_{i_k+1}, \ldots, w_{i_k+q_k-1}\}$ of consecutive workstations. Parts are loaded and unloaded by carrier $k$ from/to a distinct L/UL position $u_k$. The execution of a job requires the concurrent availability of a workstation and a non-specialized resource (in our case, a worker). A job loaded at $u_k$ can be split into a number of lots, and each lot can be sent for operation to a distinct workstation among those directly accessible from $u_k$. In the case on hand, job splitting is allowed in order to keep the job duration sufficiently
short, making parallel use of more workstations. In fact, in our application, the workstations perform a chemical treatment of the parts, and a short simultaneous treatment guarantees a uniform level of quality of the products. Generally speaking, and whenever set-up costs can be neglected, job splitting has also the advantage of minimizing the risk of a complete job cancellation in case of machine failure.

Workers have to be assigned to workstations so as to fulfil the manufacturing requirements of a specified planning horizon. These requirements are generally expressed in terms of the number of items to be produced per part type, but after job splitting are translated into a list of resources needed per part type. On this basis, and according to a preliminary assessment on resource availability, each job is partitioned off-line into lots, each one requiring a workstation and a worker. Lots are then loaded from the relevant L/UL position and allocated to the machines that the carrier can reach from that station.

At present, a computer-aided planning system handles the assignment of workers to workstations on the basis of requests from the L/UL positions. Resources are assigned in order to guarantee operation feasibility, but disregarding any optimization of resource utilization. Basically, the planning system collects data about the state of the workstations by accessing a memory device MEM shared among workstations (see Fig. 1), where the relevant pieces of information are coded in records whose format is described below:

Record: <machine_id, load_span>
<machine_id>: is the name of the workstation that writes on MEM
<load_span>: is a pair (i, j), i < j, expressing the range of L/U stations indexes served by <machine_id>

Resource assignment is then performed on a first-in first-out basis, comparing the current resource allocation to each incoming request, and completing the assignment whenever necessary.

4. The case study: optimal resource allocation

In this section we describe a model for optimizing the assignment of resources in the system. We analyze the problem under three specific respects. In the first (Section 4.1) we consider a centralized approach aimed at fulfilling the manufacturing requirements while minimizing the total cost of resources. In the second (Section 4.2) we develop primal-dual heuristics to solve the problem in a distributed fashion. In the third (Section 4.3) we identify for each workstation a range of possible autonomous decisions, and then model the workstations as autonomous agents that compete and/or cooperate under central supervision in order to fulfil a given demand of service.

4.1. A centralized approach

Following Shaw (1989), at the beginning of the planning horizon (time $t$) we focus on a set $C_t$ of $p$ clients $u_1, \ldots, u_p$ (the L/UL positions) and a set $S_t$ of $q$ servers (the workstations) $w_1, \ldots, w_q$. Let then $d_t$ denote a real non-negative $p$-vector $(d_{t1}, \ldots, d_{tp})$ representing the demand expressed by each client of $C_t$: in practice, $d_{tj}$ represents the number of workers required during period $[t, t + \Delta t]$ by job $j$. For each element $e \in C_t \cup S_t$ and for any $t$, define the neighbors $N_t(e)$ as the set of those elements which, at time $t$, element $e \in C_t$ ($\in S_t$) can resort to for asking (offering) the execution of a job. In the following, we will usually represent this relation by means of a bipartite graph $G_t = (C_t, S_t, E_t)$ (Fig. 2).

Fig. 2. The graph $G_t$ and the matrix $E$ associated with the production line of Fig. 1.

The jobs of each $u_k \in C_t$ can be assigned only to those servers in $N_t(u_k)$ which are provided with enough resources. Consider then a decision model in which a 0–1 variable $x_i$ is associated with each server $w_i$ of $S_t$. In particular, set:

$$
x_i = \begin{cases} 1 & \text{if } w_i \text{ is assigned a resource,} \\ 0 & \text{otherwise.} \end{cases}
$$

Define then as feasible a resource assignment $x$ such that each client $u_k \in C_t$ has at least $d_{tk}$ servers holding a resource in its neighborhood, i.e., an $x$ such that:

$$
\sum_{w_i \in N_t(u_k)} x_i \ge d_{tk} \qquad \forall\, u_k \in C_t. \tag{1}
$$

In a centralized approach, a supervisor can evaluate the cost $c_i$ involved in assigning a resource to server $w_i$. Such an evaluation may require some additional effort, since it should relate the cost of the resource assigned to the efficiency of the service provided by the workstation. For instance, a workstation serving two L/UL positions will in general be more efficient (e.g., in terms of jobs waiting for completion) than one serving three stations. We will not go here into details on how the costs are computed, but will deal with this topic in Section 4.3.

Minimizing the cost of a feasible assignment corresponds to solving the following generalized covering problem:

$$
\min\ cx \quad \text{subject to} \quad Ex \ge d_t, \quad 0 \le x \le 1, \quad x \ \text{integer}, \tag{2}
$$

where $E$ is the (bipartite) adjacency matrix of graph $G_t$. Due to the particular plant layout, the following property can be observed:

Proposition 1. Matrix $E$ is totally unimodular.

(Recall that a real matrix $A$ is said to be totally unimodular if the determinant $D$ of every square submatrix of $A$ is integral and such that $|D| \le 1$.) In fact, the rows and columns of $E$ can be permuted so that $E$ exhibits the "consecutive ones property" (Fig. 2). By a well-known result due to Hoffman and Kruskal (Nemhauser et al., 1989) it then follows that program (2) can be solved in polynomial time, and in particular by reformulating the problem in terms of minimum cost flow, see Ahuja et al. (1993).

4.2. A distributed primal-dual heuristic for minimum-cost resource assignment

Observe first that not only does $E$ enjoy the consecutive ones property, but it can also be permuted so that, for any two rows (columns) $e_h$, $e_k$, if $h < k$ then
• both the leftmost and the rightmost 1 of row $e_h$ have a column index not greater than that of the leftmost and rightmost 1 of row $e_k$, respectively;
• both the upmost and the downmost 1 of column $e_h$ have a row index not greater than that of the upmost and the downmost 1 of column $e_k$, respectively.

Due to this particular structure, problem (2) can in some cases be easily solved by distributed computing. Suppose, for instance, $d_t = \mathbf{1}$. In this case, by virtue of the particular ranking of columns in $E$, we can associate with any problem instance a directed acyclic graph $D = (N, A)$, where
• $N = \{0, 1, \ldots, p\}$;
• if workstation $w_i$ serves $u_{h+1}, \ldots, u_k$, then $rs \in A$ for all $r$ and $s$ such that $h \le r < s \le k$ (in other words, $Q_i = \{h, h+1, \ldots, k\}$ is a complete subgraph of $D$);
• arc $hk \in A$ is weighted by $c_{hk} = \min_{Q_i \ni hk} \{c_i\}$.

For example, consider the production line of Fig. 1, and suppose $c = (3, 4, 2, 6, 5, 1)$. Figure 3 shows the corresponding graph $D$: the complete subgraphs associated with the workstations are $Q_1 = \{0, 1\}$, $Q_2 = \{0, 1, 2\}$, $Q_3 = \{1, 2, 3\}$, $Q_4 = \{1, 2, 3, 4\}$, $Q_5 = \{2, 3, 4\}$, $Q_6 = \{3, 4\}$. In particular, since $c_1 < c_2$, one has $c_{01} = 3$.

Fig. 3. The graph $D$ associated with the line of Fig. 1, and the costs of resource assignment.

Let $c(k)$ denote the minimum cost of resources needed to serve clients $u_{k+1}, \ldots, u_p$. Then the following theorem holds:

Theorem 1. Problem (2) can be solved by computing $c(0)$ through the following recursive formula:

$$
c(p) = 0, \qquad c(h) = \min_{hk \in A} \{ c_{hk} + c(k) \}, \quad 0 \le h < p. \tag{3}
$$

Proof. In fact, Equation (3) returns a path $\pi$ of minimum cost on $D$ from node zero to node $p$. Since each arc $hk \in \pi$ corresponds one-to-one to a set of columns which are individually able to cover the components from $h + 1$ to $k$ of the right hand side of $Ex \ge \mathbf{1}$, the arcs of $\pi$ completely define a feasible solution of (2). Since each arc weight equals the minimum cost among those of the corresponding columns, the thesis follows. □

A distributed implementation of Equation (3) can be achieved by adding two fields to each record put on MEM. The modified record structure is as follows:

Record: <machine_id, clients_span, cost, successor>
<machine_id>: is the name of the workstation that writes on MEM
<clients_span>: is a pair (h, k), h < k, expressing the range of L/U stations indexes served by <machine_id>
<cost>: is the minimum cost of serving L/U stations from $u_h$ to $u_p$
<successor>: is the workstation chosen by machine_id to execute $u_{k+1}$ in an optimal solution involving L/U stations $u_h, \ldots, u_p$.

Updating records according to Equation (3), an optimal solution of problem (2) can be obtained. Record update can be done asynchronously. Let $w_i$ be able to cover clients from $u_h$ to $u_k$. As soon as $w_i$ identifies a set $S$ of workstations able to cover $u_{k+1}$, it simply chooses as successor the $w_j \in S$ with smallest cost, and then modifies its own record by computing cost according to (3) and setting successor := $w_j$. To illustrate the method, let us give a simple example.

Example 1. Consider the line of Fig. 1 and assume $c$ as before. Now and then, each workstation accesses MEM to check whether a workstation is able to serve the clients it cannot serve in the rightmost portion of the line. If no such workstation is available, a request message is left in MEM. After some time, workstations $w_4$, $w_5$ and $w_6$ recognize that they are candidates to serve the last client in the line (i.e., $u_4$), and therefore send to MEM the following records, whose successor field is still empty: <w4, (2, 4), 6, ->, <w5, (3, 4), 5, -> and <w6, (4, 4), 1, ->. As soon as the remaining workstations realize that somebody is able to serve the next client in the line, they update their records: for instance, $w_3$, the span of which is $(u_2, u_3)$, sends to MEM a record like <w3, (2, 3), 1 + 2, w6>, since among $w_4$, $w_5$ and $w_6$ the minimum cost is achieved at $w_6$. Similarly, $w_2$, the span of which is $(u_1, u_2)$, sends to MEM a record like <w2, (1, 2), 3 + 4, w3>, since among $w_4$, $w_5$ and $w_3$ the minimum cost is achieved at $w_3$. The computation is completed as soon as $w_1$ sends to MEM its record, which turns out to be <w1, (1, 1), 3 + 3, w3>. The optimal solution can be read from MEM by scanning the list of successors from a record with minimum cost among those having $u_1$ in their span. In the example described, the solution involves $w_1$, $w_3$, $w_6$: the corresponding columns of $E$ define in fact a cover of minimum cost.

This distributed way of computing $c(0)$ has the following advantages:
• there is no need for supervision to compute an optimal assignment of resources;
• the information in MEM always matches the current status of the line: if, for instance, a workstation or a carrier goes down, the relevant information is asynchronously updated by the workstation itself, and the resource request modified accordingly.

However, such a computation cannot in general be carried out with $d_t \ne \mathbf{1}$. To extend the approach to different demand vectors one might try to repeatedly apply the procedure, assigning one resource at a time to each client, until all clients are satisfied. However, this can lead to suboptimal solutions. Consider in fact the following simple example:

Example 2. Let problem (2) have the following aspect

$$
\begin{aligned}
\min\ & 3x_1 + x_2 + 2x_3 + 2x_4 + x_5 + 2x_6 + 3x_7 + 2x_8 \\
\text{subject to: } & x_1 \ge 1, \\
& x_1 + x_2 + x_3 + x_4 \ge 1, \\
& x_3 + x_4 + x_5 + x_6 + x_7 \ge 1, \\
& x_6 + x_7 + x_8 \ge 2, \\
& x_7 + x_8 \ge 1, \\
& 0 \le x_i \le 1 \quad (i = 1, \ldots, 8),
\end{aligned}
$$

where client $u_4$ requires at least two workstations provided with a resource within its neighbors. A first optimal solution assigning one resource per client activates workstations $w_1$ and $w_7$. This solution has a cost of six, but is infeasible. To satisfy client $u_4$ one can erase columns 1 and 7, set $d_t = (0, 0, 0, 1, 0)$ (i.e., erase rows 1, 2, 3 and 5) and iterate. The client is then covered at a minimum cost by setting $x_6$ or $x_8$ to one: the resulting cost is eight. However, $x_1 = x_6 = x_8 = 1$ would equally satisfy the demand at a cost of seven.

In general, the proposed iterative procedure returns a resource assignment of reasonable cost. A lower quality of solutions (with respect to the centralized approach) is balanced by the fact that, unlike the centralized one, this method operates on data continuously matching the real state of the line. Moreover, the quality of the solution can be certified by a dual heuristic, which, through a similar mechanism, attempts to solve the dual of problem (2):

$$
\max\ y d_t - z\mathbf{1} \quad \text{subject to} \quad yE - z \le c, \quad y, z \ge 0. \tag{4}
$$

Similarly to the primal heuristic, the dual heuristic proceeds by repeatedly solving a problem of the form

$$
\max\ y d_t - z\mathbf{1} \quad \text{subject to} \quad yE - z \le \mathbf{1}, \quad y, z \ge 0, \tag{5}
$$

by dynamic programming, i.e., as a longest path on a suitable directed acyclic graph. If the duality gap obtained is zero, then the primal solution is optimal. Thus, though the dual heuristic does not give direct information on resource assignment, it can be used as a measure of the quality of the primal solutions produced. In practice, in a work shift simulation of the system currently analysed, the dual heuristic certified as being optimal about 80% of the solutions computed by the primal heuristic.

4.3. A multi-agent approach

As observed in Section 2, distributing the computation does not necessarily imply an autonomous behavior of agents. In fact, in the case described in Section 4.2 there is no decision variable whose setting is under the server control: simply, each server puts data onto a shared memory device, and gets out from it a feasible (possibly optimal) resource allocation. However, the idea of separating the solution of the dual master problem from pricing naturally leads to a negotiation between each agent, viewed as a "seller", and the supervisor, the "buyer". Suppose in fact that each server $w_i$ is granted a certain freedom in selecting potential clients out of $N_t(w_i)$.
In this way, graph $G_t$ (and consequently matrix $E$ and any optimal resource assignment) is subject to vary in time depending upon the servers' choices and not only on failures and system reconfigurations.

Let $\tilde{E} \in \{0, 1\}^{p \times m}$ be a matrix whose columns represent client sets chosen as feasible alternatives for service configuration. Matrix $\tilde{E}$ is partitioned into $q$ blocks $\tilde{E}_i$ with $p \times m_i$ entries each, and the columns $E_i^h$, $h = 1, \ldots, m_i$, of the $i$th block represent feasible alternatives for $w_i$ (see Fig. 4, showing a set of feasible service configurations associated with each workstation). Let then $x_i^h$ be a 0–1 variable under supervisor control, indicating that during the planning horizon (e.g., a work shift) $w_i$ will use the $h$th alternative of service. A choice minimizing the total cost of the resources used can then be found by the following integer linear program

$$
\min \sum_{i=1}^{q} \sum_{h=1}^{m_i} c_i^h x_i^h \quad \text{subject to} \quad \tilde{E} x \ge d_t, \quad \sum_{h=1}^{m_i} x_i^h \le 1 \ \ (i = 1, \ldots, q), \quad x \ge 0,\ \text{integer}, \tag{6}
$$

where $c_i^h$ is the cost of activating for a work shift the $h$th alternative of service among those offered by server $w_i$. In a centralized approach, both $E_i^h$ and $c_i^h$ are directly evaluated by the supervisor (see Section 4.1) taking into account the cost of resources and the quality of service. In a hybrid multi-agent context like the one outlined in Section 1, the configurations of service $E_i^h$ and the costs $c_i^h$ are instead directly evaluated (and paid) by the servers (agents). In this case, no decision under the agents' control can be directly modified by a supervisor prescription, though the latter can use spurs or disincentives to drive the decision towards the optimization of certain high-level parameters. So, if the current optimum $x$ is not satisfactory, the supervisor can try to improve it by stimulating the agents to propose new configurations of service. A spur in this direction can be achieved by offering, via the publication on MEM, a "reward" for any client included by the agents into a new service configuration.
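As a rough illustration of how the selection in program (6) behaves on a tiny instance, the sketch below enumerates every choice of at most one alternative per server and keeps the cheapest one that meets the demand. The function name `select_configurations` and the two-server data are hypothetical and do not come from the case study; enumeration is used only because the instance is small, and a realistic instance would presumably call for an integer programming solver.

```python
from itertools import product


def select_configurations(configs, costs, demand):
    """Enumerate program (6): pick at most one alternative per server.

    configs[i][h] is a 0/1 tuple over the clients (column E_i^h),
    costs[i][h] is the matching cost c_i^h, demand[k] is d_tk.
    Returns (cost, choice), where choice[i] is the chosen h or None;
    returns (None, None) if no selection meets the demand.
    """
    best_cost, best_choice = None, None
    # Each server may stay inactive (None) or use one of its alternatives.
    options = [[None] + list(range(len(alts))) for alts in configs]
    for choice in product(*options):
        cover = [0] * len(demand)
        total = 0
        for i, h in enumerate(choice):
            if h is None:
                continue
            total += costs[i][h]
            cover = [a + b for a, b in zip(cover, configs[i][h])]
        feasible = all(a >= b for a, b in zip(cover, demand))
        if feasible and (best_cost is None or total < best_cost):
            best_cost, best_choice = total, choice
    return best_cost, best_choice


# Hypothetical data: two servers, three clients, unit demand.
configs = [[(1, 1, 0), (1, 0, 0)], [(0, 1, 1), (0, 0, 1)]]
costs = [[5, 3], [3, 2]]
best_cost, best_choice = select_configurations(configs, costs, [1, 1, 1])
print(best_cost, best_choice)  # prints: 6 (1, 0)
```

In this toy instance the cheapest feasible selection pairs the second alternative of the first server with the first alternative of the second server, which shows how the constraint of at most one alternative per server interacts with the covering constraint.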
A negotiation mechanism is then triggered, in which servers first evaluate the offer, then compare it to their own capability of serving the clients and the costs this decision would involve, and finally, if convenient, reply with a proposal. This proposal will include the set of clients served and the service cost. Similarly, the supervisor evaluates the proposals by updating and solving program (6), and then, if need be, replies with a new set of incentives.

An obvious question to answer is how rewards should be computed. Consider the dual of the linear relaxation of problem (6)

$$
\max\ y d_t - z\mathbf{1} \quad \text{subject to} \quad -z_i + \sum_{k \in E_i^h} y_k \le c_i^h \ \ (i = 1, \ldots, q;\ h = 1, \ldots, m_i), \quad y, z \ge 0, \tag{7}
$$

and let $\tilde{x}$, $(\tilde{y}, \tilde{z}) = (\tilde{y}_1, \ldots, \tilde{y}_p, \tilde{z}_1, \ldots, \tilde{z}_q)$ be optimal solutions of (6) and (7). Then, take $\tilde{y}_k$ as the reward offered to any agent who is going to serve $u_k$, and $\tilde{z}_i$ as a disincentive to $w_i$ to change configuration. By complementary slackness, $\tilde{y}_k = 0$ ($\tilde{z}_i = 0$) whenever $\sum_{i,h} E_{ki}^h \tilde{x}_i^h > d_{tk}$ (whenever $\sum_{h=1}^{m_i} \tilde{x}_i^h < 1$), that is, no reward is offered to the agents for adding client $u_k$ if the current service configuration is oversized for it, and $\tilde{z}_i$ may be provided as a disincentive to any agent $w_i$ which has already been assigned a configuration. Now, any service configuration $E_i^h$ of $w_i$ producing a negative reduced cost, i.e., such that

$$
\sum_{k \in E_i^h} \tilde{y}_k > c_i^h + \tilde{z}_i, \tag{8}
$$

would improve the current primal optimum $\tilde{x}$. So, the local objective of server $w_i$ naturally becomes

$$
\max_{h = 1, \ldots, m_i} \ \sum_{k \in E_i^h} \tilde{y}_k - (\tilde{z}_i + c_i^h), \tag{9}
$$

namely, moving towards a client set that, through the corresponding rewards $\tilde{y}_k$, ensures a sufficient margin over the current disincentive $\tilde{z}_i$ and the cost $c_i^h$.

Summarizing, negotiation basically operates under a protocol ensuring the realization of the following phases (Fig. 5):
1. Each agent is notified by the supervisor of an individual disincentive (= component $\tilde{z}_i$ of the dual optimum) to change configuration. On the other hand, a reward (= component $\tilde{y}_k$ of the dual optimum) is initially offered for serving the $k$th client.
2. A set of proposals (= columns) and the corresponding costs are then produced by the agents on the basis of disincentives, rewards and other issues, published in MEM and in turn read by the supervisor (= added to the current matrix $\tilde{E}$).
3. In case of acceptance of one or more proposals (= columns entering the basis after pivoting on the whole column set), the supervisor will reconsider the solution quality: if low, it will produce a new reward vector $\tilde{y}$; otherwise, it will finally arrange resources according to the current primal optimum.

Fig. 4. Feasible service configurations for a production line.

To see how the method works, consider the following simple example.

Example 3. Let the system consist of two workstations ($w_1$, $w_2$) and three L/UL positions ($u_1$, $u_2$, $u_3$): $w_1$ can be accessed from any of the three L/UL positions, whereas $w_2$ can be accessed only from $u_2$ and $u_3$ (see Fig. 6). Suppose that at time $t$ there are four lots $a$, $b$, $c$, $d$ waiting at the L/UL positions, and that $a$ is queued at $u_1$, $b$ at $u_2$, and $c$, $d$ at $u_3$. Assume all the processing times of the lots are identical (say unit), but that $a$ requires a different treatment from the others: hence, a set-up will be incurred whenever $a$ is processed immediately after, or before, any other lot. This set-up requires two time units. Finally, some time is spent in accessing the workstations from the L/UL positions: suppose that half a time unit is needed to reach position $u_1$ from $u_2$, and position $u_2$ from $u_3$ (thus $u_1$ is reachable from $u_3$ within one time unit); at the end of each mission, the carrier stops by the workstation, and the time needed to load/unload the lot from/to the workstations is negligible. A worker must stand by the workstation from time $t$ to the end of the last process.
Thus, the cost of a service configuration can be thought of as the sum of two terms: one is a fixed cost $c_0$, the other is proportional, according to some factor $c$, to the time spent by the worker at the workstation. This time clearly depends on how the worker decides to schedule the lots, and the supervisor may not know this in advance. Other terms of the cost, such as the cost of the chemical compound used for the treatment and loaded at each set-up, are neglected.

Fig. 5. Information flows in cooperative evaluation of spur (light grey) and negotiation (grey–dark grey).

Fig. 6. The system of Example 3.

Suppose that the service configurations initially available are $(1, 1, 1)$ for $w_1$ and $(0, 1, 1)$ for $w_2$. To evaluate the related costs, let us examine what kind of schedules these configurations may reasonably induce. Suppose that both $w_1$ and $w_2$ need a set-up no matter which lot is processed first and, for simplicity, that unloading to the closest position is always possible in zero time units. The best $w_1$ can do is to process the lots in the sequence $c, d, b, a$, parallelizing independent operations as much as possible:

1: max{setup c, move c}; duration 2
2: max{execute c, move d}; duration 2
3: max{execute d, move b}; duration 1
4: max{execute b, move a}; duration 1
5: setup a; duration 2
6: execute a; duration 1

for a total duration of nine time units. Similarly, the total duration of the best schedule of $w_2$ is five time units. Assume $c_0 = \$2$ and $c = \$1$ per time unit. The corresponding costs are then $c_1^1 = 11$ and $c_2^1 = 7$. The only feasible solution consists of activating both the service configurations, for a total cost of \$18. However, due to service oversizing, this is an overestimate of the real cost. So let us solve the dual

\[
\begin{aligned}
\max\ & y_1 + y_2 + 2y_3, \\
& y_1 + y_2 + y_3 \le 11, \\
& y_2 + y_3 \le 7, \\
& y_1, y_2, y_3 \ge 0.
\end{aligned}
\]

We easily obtain $y_1 = 4$, $y_2 = 0$, $y_3 = 7$. Both $w_1$ and $w_2$ will then try to serve $u_3$; moreover, $w_1$ will see whether or not it is convenient to serve $u_1$ as well. The proposal of $w_2$ is $(0, 0, 1)$, which requires four time units, costs $c_2^2 = \$6$ and rewards \$7. As for $w_1$, the best timing of $(1, 0, 1)$ is

1: max{setup c, move c}; duration 2
2: max{execute c, move d}; duration 2
3: max{execute d, move a}; duration 1
4: setup a; duration 2
5: execute a; duration 1

for a total duration of eight time units, a cost $c_1^2 = \$10$ and a reward of \$11. Instead, the activation of $(0, 0, 1)$ implies a schedule of five time units, which costs and rewards \$7. This alternative is therefore not convenient for $w_1$. The optimal service configurations are in conclusion obtained by solving:

\[
\begin{aligned}
\min\ & 11x_1^1 + 7x_2^1 + 10x_1^2 + 6x_2^2, \\
& x_1^1 + x_1^2 \ge 1, \\
& x_1^1 + x_2^1 \ge 1, \\
& x_1^1 + x_2^1 + x_1^2 + x_2^2 \ge 2, \\
& x_1^1 + x_1^2 \le 1, \\
& x_2^1 + x_2^2 \le 1, \\
& x \ge 0, \ \text{integer},
\end{aligned}
\]

which assigns $(1, 0, 1)$ to $w_1$ and $(0, 1, 1)$ to $w_2$.

Observation 1. Example 3 gives an idea of the impact of scheduling on costs. Observe that some data not directly accessible to the supervisor (or maybe only accessible at a high cost), such as the current set-up of the workstation, can significantly affect the costs and have effects on the quality and the reliability of the solution eventually found. Notice also that in the multi-agent approach resource provision is not necessarily up to the supervisor: the agents individually purchase the resources they need and then charge the related costs to the total service cost. Hence, the costs of resources have a direct impact on the agents' decisions.
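The figures of Example 3 can be checked with a short script. The sketch below is an illustration rather than the authors' procedure: it takes the costs, the demand (1, 1, 2) and the dual values from the text, prices the proposals by comparing reward and cost as in (8) (with no disincentive, as in the example's dual), and then solves the final 0–1 program by enumeration. The variable and function names are assumptions introduced here.

```python
from itertools import product

# Data of Example 3; columns are client vectors over (u1, u2, u3).
demand = (1, 1, 2)
initial = {("w1", (1, 1, 1)): 11, ("w2", (0, 1, 1)): 7}
y = (4, 0, 7)  # dual optimum reported in the text

def reward(column, prices):
    """Total reward of a column: sum of y_k over the clients it serves."""
    return sum(p * e for p, e in zip(prices, column))

# Dual feasibility on the initial columns, and dual objective value.
dual_feasible = all(reward(col, y) <= c for (_, col), c in initial.items())
dual_value = sum(p * d for p, d in zip(y, demand))

# Proposals discussed in the text, with their costs. A column is
# attractive when its reward strictly exceeds its cost.
proposals = {("w1", (1, 0, 1)): 10, ("w1", (0, 0, 1)): 7, ("w2", (0, 0, 1)): 6}
attractive = {key for key, c in proposals.items() if reward(key[1], y) > c}

# Final 0-1 program over the initial and the attractive columns.
columns = list(initial.items()) + [
    (key, c) for key, c in proposals.items() if key in attractive
]

def feasible(x):
    # Covering constraints: each client k is served at least d_k times.
    for k, d in enumerate(demand):
        if sum(xi * col[k] for xi, ((_, col), _) in zip(x, columns)) < d:
            return False
    # Each workstation activates at most one configuration.
    for w in ("w1", "w2"):
        if sum(xi for xi, ((owner, _), _) in zip(x, columns) if owner == w) > 1:
            return False
    return True

costs = {
    x: sum(xi * c for xi, (_, c) in zip(x, columns))
    for x in product((0, 1), repeat=len(columns))
    if feasible(x)
}
best = min(costs.values())
optima = [x for x, c in costs.items() if c == best]
print(dual_value, sorted(attractive), best, optima)
```

With these data the dual value equals the initial cost of \$18, the proposals $(1, 0, 1)$ of $w_1$ and $(0, 0, 1)$ of $w_2$ pass the pricing test, and the minimum cost drops to \$17. The assignment given in the text attains this value; under the figures used here the enumeration also reports $(1, 1, 1)$ for $w_1$ together with $(0, 0, 1)$ for $w_2$ at the same cost.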
Of course, costs and proposals, and consequently the negotiation mechanism, can be biased by various issues independently evaluated by agents, such as the distance of the clients, the agents' workloads, etc.

Observation 2. The negotiation protocol discussed does not imply direct negotiation among the agents: they may not even communicate with each other. Rather, it tends to let the agents naturally develop a reciprocal attitude to competition: if a service is inconvenient, an agent will accept it only if the reward is large enough to compensate possible losses; on the other hand, the mechanism itself will spur the agents to do the job to the best of their ability. Notice however that direct negotiation among the agents is not excluded. For instance, a set of offers provided by distinct agents can be recombined to cover the same service demand with different columns, on condition that the new offers result in no disadvantage to the supervisor.

Observation 3. The linear relaxation of program (6) has an interesting interpretation, since in this case $x_i^h$ indicates the fraction of a planning period during which $w_i$ is activated using the $h$th service alternative. In particular, the first set of constraints ensures that every client obtains the required level of service during the work shift, no matter from which workstation: in fact, client $u_k$ can be served for some time by one workstation, and for the rest of the shift by another. Also, the same workstation can operate with different service configurations (that is, on a different set of clients) during distinct time spans. The only constraint on this way of operating is that the total operation time of any workstation $w_i$ be no longer than the work shift. In the considered system the overall performance roughly depends on two factors: (i) the cost of assigning resources to agents; and (ii) the system efficiency in fulfilling the demand.
Introducing flexibility by relaxing the integrality stipulations of (6) contributes to reducing the former cost; the latter factor, which is mainly related to service configuration switches, is on the other hand neglected. This is not a problem with a more rigid formulation, like (6), implying no configuration switch; and in our case study, where switches do not significantly affect the system efficiency, the flexible approach seems just as reasonable. Clearly, this might not be the case in other situations.

Observation 4. In general, the alternatives offered by the agents may destroy the total unimodularity of the initial matrix, unless some restrictions are applied. Reasonable restrictions are:

· (continuity): if $u_h$ and $u_k$ are in a feasible alternative proposed by a server, then $u_r$ is also in that alternative for all $r$ between $h$ and $k$;
· (non-overtake): if server $w_i$ precedes server $w_j$ in the line, then for every alternative $\{u_h, \ldots, u_k\}$ chosen by $w_i$ and $\{u_r, \ldots, u_s\}$ chosen by $w_j$ one always has $h \le r$ and $k \le s$.

Under these restrictions we can define the agents' range of autonomy so as to ensure the preservation of a suitable column ranking in the updated matrix; then, due to the particular objective adopted in (6), an exact solution of the dual (7) can be found in a distributed fashion by the agents through a computation quite similar to that described in Section 4.2. This aspect points out that the evaluation phase carried out by each agent at the beginning of the negotiation protocol has the nature of a cooperation process among agents, as depicted in Fig. 5. Clearly, total unimodularity is no longer relevant if the linear relaxation of (6) is adopted.

5. Extensions

Let $J$ be a set of jobs and $R$ a set of resources, and let $E$ indicate a positive real matrix with rows corresponding to jobs and columns to resources. Let also $d$ denote a required level of service, where $d_j$ represents the resource level required by job $j$.
Similarly to Section 4.1, we can define the problem of activating resources at a minimum cost by introducing 0–1 variables $x_i$ for each resource $i \in R$, such that $x_i = 1$ if and only if resource $i$ is activated:

\[
\begin{aligned}
\min\ & cx, \\
& Ex = d, \\
& 0 \le x \le 1, \ x \ \text{integer}.
\end{aligned}
\tag{10}
\]

In a multi-agent environment in which resources and jobs are modeled as autonomous agents, a resource-agent (resp., a job-agent) may be regarded as a "seller" (resp., as a "buyer"): job-agent $j$ can obtain the required level of service from the resource-agents in its neighborhood, defined as the set of resource-agents $i$ such that $a_{ji} > 0$. In this scenario, no centralized optimization occurs, but resource-agents individually decide whether or not to be activated, depending on the requirements coming from their neighborhood. From the buyers' perspective, being active represents a cost. In the scenario depicted, there are actually two situations originating diseconomies, which we will respectively call no-stock and no-client:

· no-stock means that the server is not activated, though its activation is necessary to fulfil a client, so the client is lost;
· no-client means that the server is activated, but eventually no client requires a service from it, so a cost is borne with no return.

It is clearly important for a resource-agent to realize whether a no-stock or a no-client situation will eventually occur. To this aim, integer linear programming can again be used as a tool for decision-making. In fact, a deterministic behavior will be adopted by the agent either when $x_i = 1$ for all feasible $x$, or when $x_i = 0$ for all feasible $x$; while in all the other cases the agent will run a risk whether it decides to be active or inactive. To detect which situation is going to occur, the agent simply has to check whether or not

\[
\min_x x_i = \max_x x_i
\tag{11}
\]

subject to the constraints of (10).
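The test in (11) can be sketched on a small case by enumerating the 0–1 vectors that satisfy the constraints of (10) and comparing the smallest and largest value taken by each $x_i$. This is a brute-force illustration only: the function name `activation_status`, the labels it returns and the instance below are hypothetical, and a real agent would solve the two optimization problems rather than enumerate.

```python
from itertools import product

def activation_status(E, d):
    """Classify each resource by comparing min x_i and max x_i over all
    0-1 vectors x with E x = d, as in check (11). Tiny instances only."""
    n = len(E[0])
    feasible = [
        x for x in product((0, 1), repeat=n)
        if all(sum(a * xi for a, xi in zip(row, x)) == dj
               for row, dj in zip(E, d))
    ]
    if not feasible:
        return None  # the constraints of (10) admit no solution
    status = []
    for i in range(n):
        lo = min(x[i] for x in feasible)
        hi = max(x[i] for x in feasible)
        if lo == 1:
            status.append("active")        # x_i = 1 in every feasible x
        elif hi == 0:
            status.append("inactive")      # x_i = 0 in every feasible x
        else:
            status.append("undetermined")  # min x_i < max x_i
    return status

# Hypothetical instance: 2 jobs (rows) and 4 resources (columns).
E = [[1, 0, 0, 2],
     [0, 1, 1, 0]]
d = [1, 1]
status = activation_status(E, d)
print(status)
```

Here the first resource must be active, the last one can never be, and the two middle resources are interchangeable, so each of them faces a nondeterministic decision.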
In particular, $i$ will certainly be active if $\min_x x_i = \max_x x_i = 1$, whereas it will be inactive if $\min_x x_i = \max_x x_i = 0$. If, on the other hand, $\min_x x_i < \max_x x_i$, then the decision will be nondeterministic.

Generally speaking, the problems introduced are NP-hard, unless specific properties of matrix $E$ hold, as observed in Section 4. A situation of particular interest is when $E$ expresses the adjacency of nodes in a particular graph $G_t$ representing the system structure. We now derive a class of graphs that properly includes the layout of the case study analysed in Section 3 and guarantees an efficient solution of the problem. Let us make the following assumption on the structure of the neighborhood of $C_t$.

Assumption 1. The elements of both $C_t$ and $S_t$ can be ordered so that, for any pair $a_i$, $a_j$ of clients (servers) adjacent to the same server (client), every agent $a_k$ with $i \le k \le j$ is also adjacent to that server (client) (the order can also be cyclical).

Then, one can prove the following

Theorem 2. Under Assumption 1, the resource allocation problem can be solved by linear programming.

Proof. If the order of $C_t$ and $S_t$ is not cyclical, then $E$ has the consecutive ones property, and therefore is totally unimodular. Suppose now that Assumption 1 holds under a cyclical order. Since $G_t$ is bipartite, all of its cycles are even. Let us distinguish two cases: (i) no cycle of $G_t$ has $2k$ nodes with $k$ odd; and (ii) there exists at least one cycle $W$ of $G_t$ with $2k$ nodes and $k$ odd. In case (i), $E$ is totally unimodular, whereas in case (ii) it is not. In the latter case, one can however remove from $G_t$ a pair of nodes of $W \cap S_t$ (and the corresponding columns from $E$) so that the resulting bipartite adjacency matrix $E$ is totally unimodular. This corresponds to fixing two 0–1 variables: the solution of the problem is then obtained from that of four distinct linear programs. □

6. Conclusions

In this paper we discuss a methodology for the optimal allocation of resources to a manufacturing system in a multi-agent environment. We use a case study to show that quantitative decision-making can be implemented by the agents using a new approach. The approach introduces a spur system based on dual pricing to stimulate agents to propose alternative service configurations in order to improve the current resource allocation established at the supervisor level. A basic negotiation protocol is defined, which involves both resource bidding and agents' cooperation. The main difference between the proposed approach and the existing literature is the use of the mathematical properties of the model to guarantee or approximate an optimal behavior of the agents with respect to both local and global objectives. It turns out that, in some respects, the holonic approach is more profitable than a centralized one: in particular, though the distribution of the computation among the agents may cause solution suboptimality, the increased level of flexibility may correspond to the relaxation of constraints, among which integrality clauses, hence resulting in a global improvement in both computational efficiency and solution.

Further research is needed to verify the potential of the described methodology when applied to other settings and problems. A quite promising area for a multi-agent approach appears to be that of operation scheduling, as far as features like robustness and flexibility are concerned; see also Lucertini et al. (1999). For instance, Dantzig–Wolfe decomposition proved to be a useful technique to tackle scheduling problems with additive due-date related objectives on unrelated parallel machines (Arbib et al., 1999 and Chen and Powell, 1999). Here, generating a column corresponds to finding a single-machine schedule which maximizes a local objective.
This may prefigure a holonic approach in which machines autonomously use local computational resources in a framework similar to that depicted in Section 4.3.

Acknowledgement

The authors deeply acknowledge Stefano Smriglio for stimulating discussions and ideas, and express their gratitude to an anonymous referee for the constructive comments.

References

Adacher, L., Agnetis, A. and Meloni, C. (1999) Autonomous agents architecture and algorithms in flexible manufacturing systems. Tech. Rep. 43–99, Dipartimento di Informatica e Automazione, Università di "Roma Tre", Via della Vasca Navale 79, 00146, Rome, Italy.
Ahuja, R.K., Magnanti, T.L. and Orlin, J.B. (1993) Network Flows, Prentice Hall, Englewood Cliffs, NJ.
Arbib, C., Ciaschetti, G. and Rossi, F. (1999) Distributing material flows in a manufacturing system with large product mix. Two models based on column generation. Lecture Notes in Economics and Mathematical Systems. New Trends in Distribution. Logistics, Springer-Verlag, Berlin, pp. 235–253.
Balasubramanian, S. and Norrie, D. (1995) A multi-agent intelligent design system integrating manufacturing and shop-floor control, in Proceedings of the First International Conference on Multiagent Systems, pp. 3–19.
Boutilier, C., Shoham, Y. and Wellman, M.P. (1997) Economic principles of multi-agent systems. Artificial Intelligence, 94, 1–2.
Brafman, R.I. and Tennenholtz, M. (1997) Modeling agents as qualitative decision makers. Artificial Intelligence, 94, (1–2), 217–269.
Chen, Z.L. and Powell, W.B. (1999) Solving parallel machine scheduling problems by column generation. INFORMS Journal on Computing, 11(1), 78–94.
Fischer, K., Müller, J.P., Pischel, M. and Schier, D. (1995) A model for cooperative transportation scheduling, in Proceedings of the First International Conference on Multiagent Systems, AAAI, San Francisco, CA, pp. 109–116.
Koestler, A. (1967) The Ghost in the Machine, Hutchinson and Co., London, UK.
Koller, D. and Pfeffer, A. (1997) Representations and solutions for game-theoretic problems. Artificial Intelligence, 94, (1–2), 167–217.
Kraus, S. (1997) Negotiation and cooperation in multi-agent environments. Artificial Intelligence, 94, (1–2), 79–99.
Lin, G.Y. and Solberg, J.J. (1992) Integrated shop floor control using autonomous agents, IIE Transactions, 24(3), 57–71.
Leung, J., Yang, X., Mak, R. and Lam, K. (1997) Optimal cyclic multihost scheduling: a mixed integer programming approach, in Proceedings of the 5th Twente Workshop on Graphs and Combinatorial Optimization, University of Twente, Enschede, The Netherlands, pp. 170–175.
Lucertini, M., Nicolò, F. and Smriglio, S. (1999) Assignment and sequencing of parts by autonomous workstations. Tech. Rep. n. 364, Centro V. Volterra, Via della Ricerca Scientifica, 00133, Rome.
Malone, T.W., Fikes, R.E., Grant, K.R. and Howard, M.T. (1988) Enterprise: a marketlike task schedule for distributed computing environments, in The Ecology of Computation, Huberman, B.A. (ed.), North-Holland, pp. 177–205.
Malone, T.W. and Smith, S.A. (1988) Modeling the performance of organizational structures. Operations Research, 36, 3.
Nemhauser, G.L., Rinnooy Kan, A.H.G. and Todd, M.J. (1989) Optimization, North Holland, Amsterdam, pp. 421–436.
Poole, D. (1997) The independent choice logic for modeling multiple agents under uncertainty. Artificial Intelligence, 94, (1–2), 139–167.
Ramos, C. and Sousa, P. (1996) Scheduling orders in manufacturing systems using a holonic approach, in Pre-proceedings of the European Workshop on Agent-oriented Systems in Manufacturing, Berlin (D), pp. 80–85.
Sandholm, T.W. and Lesser, V.R. (1997) Coalitions among computationally bounded agents. Artificial Intelligence, 94, (1–2), 99–139.
Shaw, M.J. (1989) FMS scheduling as cooperative problem solving. Annals of Operations Research, 17, 326–346.
Shoham, Y. and Tennenholtz, M. (1997) On the emergence of social conventions: modeling, analysis and simulations. Artificial Intelligence, 94, (1–2), 139–167.
Talukdar, S., Baerentzen, L., Gove, A. and de Souza, P. (1998) Asynchronous teams: cooperation schemes for autonomous agents. Journal of Heuristics, 4, 295–321.

Biographies

Claudio Arbib received a Ph.D. in System Engineering from the University of Rome "La Sapienza" in 1987. From 1987 to 1992 he was an Assistant Professor of Operational Research at the Department of Electronic Engineering of the II University of Rome, and since 1992 he has been an Associate Professor of Operational Research at the Department of Pure and Applied Mathematics of the University of L'Aquila, Italy. His research interests include applications of combinatorial optimization and graph theory to industrial engineering, with particular reference to manufacturing and logistics.

Fabrizio Rossi received a Ph.D. in Operational Research from the University of Rome "La Sapienza" in 1996. Since 1997 he has been an Assistant Professor of Operational Research at the Department of Pure and Applied Mathematics of the University of L'Aquila, Italy. His research interests include applications of combinatorial optimization to industrial engineering, with particular reference to manufacturing and telecommunications.
