We consider a Bayesian persuasion problem where the sender tries to persuade the receiver to take a particular action via a sequence of signals. This we model by considering multi-phase trials in which different experiments are conducted based on the outcomes of prior experiments. In contrast to most of the literature, we consider the problem with constraints on signals imposed on the sender. This we achieve by fixing some of the experiments in an exogenous manner; these are called determined experiments. This modeling helps us understand real-world situations where such constraints arise: e.g., multi-phase drug trials where the FDA determines some of the experiments; start-up acquisition by big firms where late-stage assessments are determined by the potential acquirer; multi-round job interviews where the candidate signals initially by presenting their qualifications but the rest of the screening procedure is determined by the interviewer. The non-determined experiments (signals) in the multi-phase...
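To fix intuition for the persuasion setup, here is a minimal single-phase sketch (a standard Kamenica–Gentzkow-style example, not the paper's multi-phase constrained model): a binary state, a receiver who acts iff her posterior that the state is good reaches a threshold, and a sender who commits to an experiment. All numbers below are illustrative assumptions.

```python
# Toy single-phase Bayesian persuasion: the sender commits to an experiment
# (a signal distribution per state) to maximize the probability the receiver
# acts. The receiver acts iff P(good | signal) >= threshold.

def optimal_experiment(prior: float, threshold: float):
    """Return (P(act-signal | good), P(act-signal | bad), P(receiver acts))."""
    if prior >= threshold:            # receiver already convinced: reveal nothing
        return 1.0, 1.0, 1.0
    # Split the prior into posteriors {0, threshold}: always send "act" in the
    # good state, and send it just often enough in the bad state that the
    # posterior after "act" lands exactly on the threshold.
    q_bad = prior * (1 - threshold) / ((1 - prior) * threshold)
    p_act = prior + (1 - prior) * q_bad    # equals prior / threshold
    return 1.0, q_bad, p_act

prior, threshold = 0.3, 0.5
p_good, p_bad, p_act = optimal_experiment(prior, threshold)
posterior = prior * p_good / (prior * p_good + (1 - prior) * p_bad)
print(p_act, posterior)   # receiver acts 60% of the time; posterior pinned at 0.5
```

The point of the construction is that the sender gains by pooling: with prior 0.3 and threshold 0.5, honest revelation would persuade only 30% of the time, while the optimal experiment persuades with probability prior/threshold = 0.6. The paper's determined experiments constrain which such splits the sender may use in later phases.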
ACM Transactions on Economics and Computation, 2020
We study the influence maximization problem in undirected networks, specifically focusing on the independent cascade and linear threshold models. We prove APX-hardness (NP-hardness of approximation within factor (1-τ) for some constant τ > 0) for both models, which improves the previous NP-hardness lower bound for the linear threshold model. No previous hardness result was known for the independent cascade model. As part of the hardness proof, we show some natural properties of these cascades on undirected graphs. For example, we show that the expected number of infections of a seed set S is upper bounded by the size of the edge cut of S in the linear threshold model and a special case of the independent cascade model, the weighted independent cascade model. Motivated by our upper bounds, we present a suite of highly scalable local greedy heuristics for the influence maximization problem on both the linear threshold model and the weighted independent cascade model on undirected g...
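The expected spread of a seed set under the independent cascade model has no closed form in general, but it is straightforward to estimate by Monte Carlo simulation. The sketch below is a toy illustration of the model on a hypothetical 4-node graph, not the paper's scalable heuristics.

```python
import random

# Monte Carlo estimate of expected spread under the independent cascade model
# on an undirected graph. 'graph' maps node -> {neighbor: transmission prob}.
# Each newly infected node gets one chance to infect each uninfected neighbor.

def ic_spread(graph, seeds, trials=2000, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        infected = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in graph[u].items():
                    if v not in infected and rng.random() < p:
                        infected.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(infected)
    return total / trials

# 4-cycle with transmission probability 0.5 on every edge
g = {0: {1: .5, 3: .5}, 1: {0: .5, 2: .5}, 2: {1: .5, 3: .5}, 3: {0: .5, 2: .5}}
print(ic_spread(g, [0]))   # close to 41/16 = 2.5625, the exact expectation here
```

On this tiny graph the expectation can be checked by hand via live-edge percolation (each edge is live independently with probability 0.5, and the spread is the set reachable from the seed), which is also the standard device behind the edge-cut upper bounds mentioned above.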
Peer-prediction is a mechanism which elicits privately-held, non-verifiable information from self-interested agents---formally, truth-telling is a strict Bayes Nash equilibrium of the mechanism. The original Peer-prediction mechanism suffers from two main limitations: (1) the mechanism must know the "common prior" of agents' signals; (2) additional undesirable and non-truthful equilibria exist which often have a greater expected payoff than the truth-telling equilibrium. A series of results has successfully weakened the known common prior assumption. However, the equilibrium multiplicity issue remains a challenge. In this paper, we address the above two problems. In the setting where a common prior exists but is not known to the mechanism we show (1) a general negative result applying to a large class of mechanisms showing truth-telling can never pay strictly more in expectation than a particular set of equilibria where agents collude to "relabel" the signals a...
ACM Transactions on Economics and Computation, 2019
In the setting where information cannot be verified, we propose a simple yet powerful information theoretical framework—the Mutual Information Paradigm—for information elicitation mechanisms. Our framework pays every agent a measure of mutual information between her signal and a peer’s signal. We require that the mutual information measurement has the key property that any “data processing” on the two random variables will decrease the mutual information between them. We identify such information measures that generalize Shannon mutual information. Our Mutual Information Paradigm overcomes the two main challenges in information elicitation without verification: (1) how to incentivize high-quality reports and avoid agents colluding to report random or identical responses; (2) how to motivate agents who believe they are in the minority to report truthfully. Aided by these information measures, we (1) use the paradigm to design a family of novel mechanisms where truth-telling is...
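The core mechanism of the paradigm can be sketched in a few lines: pay each agent the (estimated) mutual information between her report and a peer's report. The data processing inequality then does the incentive work, since any unilateral garbling of one's true signal can only lower the payment. The joint distribution below is a toy assumption for illustration.

```python
from math import log2

# Sketch of the Mutual Information Paradigm's payment rule: pay an agent the
# Shannon mutual information between her report and a peer's report, computed
# here from an assumed joint distribution over binary signals.

def mutual_info(joint):
    """joint[x][y] = P(X=x, Y=y); returns I(X;Y) in bits."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(p * log2(p / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

truthful = [[0.4, 0.1], [0.1, 0.4]]          # positively correlated signals
mi_truth = mutual_info(truthful)

# "Data processing": the agent reports her signal through a noisy channel that
# flips it with probability 0.3. This can only lower the mutual information.
def garble(joint, eps=0.3):
    out = [[0.0, 0.0], [0.0, 0.0]]
    for x in range(2):
        for y in range(2):
            out[x][y]     += joint[x][y] * (1 - eps)
            out[1 - x][y] += joint[x][y] * eps
    return out

mi_garbled = mutual_info(garble(truthful))
print(mi_truth, mi_garbled)   # truthful payment strictly exceeds garbled payment
```

Note this toy uses plain Shannon mutual information; the paper's point is that any information measure satisfying the data processing property can be plugged into the same payment template.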
The 29th Annual ACM-SIAM Symposium on Discrete Algorithms, 2018
We study opinion dynamics on networks with two communities. Each node has one of two opinions and updates its opinion as a ``majority-like'' function of the frequency of opinions among its neighbors. The networks we consider are weighted graphs comprised of two equally sized communities where intracommunity edges have weight $p$, and intercommunity edges have weight $q$. Thus $q$ and $p$ parameterize the connectivity between the two communities.
We prove a dichotomy theorem about the interaction of the two parameters: 1) the ``majority-like'' update function, and 2) the level of intercommunity connectivity. For each setting of parameters, we show that either: the system quickly converges to consensus with high probability in time $\Theta(n \log(n))$; or, the system can get ``stuck'' and take time $2^{\Theta(n)}$ to reach consensus. We note that $O(n \log(n))$ is optimal because it takes this long for each node to even update its opinion.
Technically, we achieve this fast convergence result by exploiting the connection between a family of reinforced random walks and the dynamical systems literature. Our main result shows that if the system is a reinforced random walk with a gradient-like function, it converges to an arbitrary neighborhood of a local attracting point in $O(n\log n)$ time with high probability. This result adds to the recent literature on saddle-point analysis and shows that a large family of stochastic gradient descent algorithms converges to a local minimum in $O(n\log n)$ time when the step size is $O(1/n)$.
Our opinion dynamics model captures a broad range of systems, sometimes called interacting particle systems, exemplified by the voter model, iterative majority, and iterative $k$-majority processes---which have found use in many disciplines including distributed systems, statistical physics, social networks, and Markov chain theory.
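A simplified mean-field simulation of the two-community dynamics looks as follows. This is an illustrative sketch only (it tracks opinion counts rather than an explicit graph, and uses a deterministic majority update rather than the paper's general ``majority-like'' functions); all parameter values are assumptions.

```python
import random

# Toy simulation of majority dynamics on two communities of size n/2:
# intra-community edges have weight p, inter-community edges weight q.
# Each step, a uniformly random node adopts the opinion with the larger
# weighted frequency among all nodes (a mean-field simplification).

def run(n=200, p=1.0, q=0.8, steps=200000, seed=1):
    rng = random.Random(seed)
    half = n // 2
    ones = [int(0.6 * half), int(0.3 * half)]   # count of 1-opinions per community
    for t in range(steps):
        c = rng.randrange(2)                    # community of the updating node
        w1 = p * ones[c] + q * ones[1 - c]      # weighted support for opinion 1
        w0 = p * (half - ones[c]) + q * (half - ones[1 - c])
        node_is_one = rng.random() < ones[c] / half
        if w1 > w0 and not node_is_one:
            ones[c] += 1
        elif w0 > w1 and node_is_one:
            ones[c] -= 1
        if ones == [0, 0] or ones == [half, half]:
            return t                            # consensus reached at step t
    return steps

print(run())   # with this start, opinion 0 holds the weighted majority everywhere
```

With the assumed starting counts, opinion 0 has the weighted majority in both communities, so the run drifts to the all-0 consensus quickly; starting near a balanced split instead is where the dichotomy between fast convergence and getting stuck becomes visible.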
The 18th International Conference on Autonomous Agents and Multiagent Systems, 2019
In this work we look at opinion formation and the effects of two phenomena, both of which promote consensus between agents connected by ties: influence, agents changing their opinions to match their neighbors; and selection, agents rewiring to connect to new agents when an existing neighbor has a different opinion. In our agent-based model, we assume that only weak ties can be rewired and strong ties do not change. The network structure as well as the opinion landscape thus co-evolve with two important parameters: the probability of influence versus selection, and the fraction of strong ties versus weak ties. Using empirical and theoretical methodologies we discovered that on a two-dimensional spatial network:
• With no/low selection, the presence of weak ties enables fast consensus. This conforms with the classical theory that weak ties are helpful for quickly mixing and spreading information, while strong ties alone act much more slowly.
• With high selection, too many weak ties inhibit any consensus at all: the graph partitions. The weak ties reinforce the differing opinions rather than mixing them. However, sufficiently many strong ties promote convergence, though at a slower pace.
We additionally test the aforementioned results using a real network. Our study relates two theoretical ideas: the strength of weak ties, that weak ties are useful for spreading information; and the idea of echo chambers or filter bubbles, that people are typically bombarded by the opinions of like-minded individuals. The difference lies in how (much) selection operates.
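The influence-versus-selection trade-off can be sketched in a minimal agent-based model. This is not the paper's two-dimensional spatial setup: strong ties here form a simple ring, weak ties are single random links, and rewiring targets are chosen uniformly (self-loops and duplicates are tolerated for brevity). All parameters are illustrative assumptions.

```python
import random

# Minimal influence-vs-selection sketch: agents hold binary opinions.
# Strong ties (a fixed ring) never change; weak ties may be rewired.
# With probability sel, a disagreeing weak tie is rewired (selection);
# otherwise the agent copies the sampled neighbor's opinion (influence).

def step(opinions, strong, weak, sel, rng):
    i = rng.randrange(len(opinions))
    j = rng.choice(strong[i] + weak[i])
    if opinions[i] != opinions[j] and j in weak[i] and rng.random() < sel:
        weak[i].remove(j)                             # selection: drop the tie
        weak[i].append(rng.randrange(len(opinions)))  # rewire uniformly at random
    else:
        opinions[i] = opinions[j]                     # influence: adopt j's opinion

rng = random.Random(7)
n = 100
opinions = [rng.randrange(2) for _ in range(n)]
strong = [[(i - 1) % n, (i + 1) % n] for i in range(n)]   # ring of strong ties
weak = [[rng.randrange(n)] for _ in range(n)]             # one weak tie each
for _ in range(50000):
    step(opinions, strong, weak, sel=0.1, rng=rng)
print(sum(opinions))   # count of 1-opinions after the run
```

Raising `sel` shifts the model from the fast-mixing regime (weak ties spread opinions) toward the partitioning regime (weak ties are rewired away from disagreement before they can mix anything), which is the qualitative dichotomy described above.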
Social behaviors and choices spread through interactions and may lead to a cascading behavior. Understanding how such social cascades spread in a network is crucial for many applications ranging from viral marketing to political campaigns. The behavior of a cascade depends crucially on the model of cascade or social influence and on the topological structure of the social network.
In this paper we study the general threshold model of cascades, which is parameterized by a distribution over the natural numbers: the collective influence from infected neighbors, once beyond the threshold of an individual u, triggers the infection of u. By varying the choice of the distribution, the general threshold model can model cascades with and without the submodular property. In fact, the general threshold model captures many previously studied cascade models as special cases, including the independent cascade model, the linear threshold model, and k-complex contagions.
We provide both analytical and experimental results for how cascades from a general threshold model spread in a general growing network model, which contains preferential attachment models as special cases. We show that if we choose the initial seeds as the early arriving nodes, the contagion can spread to a good fraction of the network and this fraction crucially depends on the fixed points of a function derived only from the specified distribution. We also show, using a coauthorship network derived from DBLP databases and the Stanford web network, that our theoretical results can be used to predict the infection rate up to a decent degree of accuracy, while the configuration model does the job poorly.
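The seeding experiment described above can be sketched end to end: grow a preferential-attachment graph, draw each node's threshold from a distribution over the naturals, seed the earliest-arriving nodes, and run the cascade to fixation. The graph generator, distribution, and scale below are toy assumptions, not the paper's experimental setup.

```python
import random

# General-threshold cascade on a toy preferential-attachment graph.
# Each node draws an integer threshold from a distribution D and becomes
# infected once at least that many of its neighbors are infected.

def pa_graph(n, m, rng):
    """Preferential attachment: each arrival links to m degree-biased targets."""
    adj = [set() for _ in range(n)]
    adj[0].add(1); adj[1].add(0)
    ends = [0, 1]                      # one entry per edge endpoint (degree-biased)
    for v in range(2, n):
        chosen = {rng.choice(ends) for _ in range(m)}   # may yield < m distinct
        for u in chosen:
            adj[v].add(u); adj[u].add(v)
            ends += [u, v]
    return adj

def cascade(adj, thresholds, seeds):
    """Iterate to fixation: infect any node meeting its threshold."""
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in range(len(adj)):
            if v not in infected and len(adj[v] & infected) >= thresholds[v]:
                infected.add(v)
                changed = True
    return infected

rng = random.Random(3)
n = 500
adj = pa_graph(n, 2, rng)
# D: threshold 1 w.p. 0.5, threshold 2 w.p. 0.3, effectively-infinite w.p. 0.2
ths = [rng.choices([1, 2, 10**9], weights=[5, 3, 2])[0] for _ in range(n)]
seeds = list(range(10))               # the earliest-arriving nodes, as above
frac = len(cascade(adj, ths, seeds)) / n
print(frac)                           # infected fraction of the network
```

Varying the threshold distribution here is exactly the knob the abstract describes: mass on threshold 1 behaves like an independent-cascade-style submodular process, while mass on thresholds 2 and above produces complex-contagion behavior.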
Sybil attacks, in which an adversary creates a large number of identities, present a formidable problem for the robustness of recommendation systems. One promising method of sybil detection is to use data from social network ties to implicitly infer trust.
Previous work along this dimension typically a) assumes that it is difficult/costly for an adversary to create edges to honest nodes in the network; and b) limits the amount of damage done per such edge, using conductance-based methods. However, these methods fail to detect a simple class of sybil attacks which have been identified in online systems. Indeed, conductance-based methods seem inherently unable to do so, as they are based on the assumption that creating many edges to honest nodes is difficult, which seems to fail in real-world settings.
We create a sybil defense system that accounts for the adversary's ability to launch such attacks yet provably withstands them by:
1. Not assuming any restriction on the number of edges an adversary can form, but instead making a much weaker assumption that creating edges from sybils to most honest nodes is difficult, yet allowing that the remaining nodes can be freely connected to.
2. Relaxing the goal from classifying all nodes as honest or sybil to the goal of classifying the "core" nodes of the network as honest, and classifying no sybil nodes as honest.
3. Exploiting a new (for sybil detection) social network property, namely, that nodes can be embedded in low-dimensional spaces.