Ian McCulloh
  • Laurel, Maryland, United States
To systematically understand the effects of vulnerabilities introduced by AI/ML-enabled Army Multi-domain Operations, we provide an overview of the characterization of ML attacks, with an emphasis on black-box vs. white-box attacks. We then study a system and attack model for Army MDO applications and services, and introduce the roles of stakeholders in this system. We show, across attack scenarios and under different levels of adversary knowledge of the deployed system, how peer adversaries can employ deceptive techniques to defeat algorithms, and how the system should be designed to minimize the impact of such attacks. We demonstrate the feasibility of our approach in a cyber threat intelligence use case. We conclude with a path forward for design and policy recommendations for robust and secure deployment of AI/ML applications in Army MDO environments.
To facilitate the widespread acceptance of AI systems guiding decision-making in real-world applications, it is key that solutions comprise trustworthy, integrated human-AI systems. Not only in safety-critical applications such as autonomous driving or medicine, but also in dynamic open-world systems in industry and government, it is crucial for predictive models to be uncertainty-aware and yield trustworthy predictions. Another key requirement for deploying AI at enterprise scale is integrating human-centered design into AI systems so that humans are able to use systems effectively, understand results and output, and explain findings to oversight committees. While the focus of this symposium was on AI systems to improve data quality and technical robustness and safety, we welcomed submissions from broadly defined areas also discussing approaches addressing requirements such as explainable models, human trust, and ethical aspects of AI.
Abstract Research in network monitoring spans a large and growing number of disciplines, including mathematics, physics, computer science, and statistics. Here, the panelists discuss the advantages and disadvantages of the interdisciplinary nature of the area. It is largely agreed that integrating expertise from many disciplines drives innovation in network monitoring development, but several notable barriers are discussed that limit the area’s full potential.
Abstract In this article, the panelists broadly discuss the definition of network monitoring, and how it may be similar to or different from network surveillance and network change-point detection. The discussion uncovers ambiguity and contradictions associated with these terms and we argue that this lack of clarity is detrimental to the field. The panelists also describe existing and emerging applications of network monitoring, which serves to illustrate the wide applicability of the tools and research associated with the field.
One of the most asked questions about ISIS during its occupation of large swathes of Iraq is this: What was it like to live under the governance of the group? Using data collected from ordinary Iraqis, the chapter attempts to give a picture of everyday life in ISIS-occupied Iraq. Most Sunni Iraqis who experienced the arrival of ISIS, particularly in Mosul, say the group was largely accepted at first, as an alternative to what was viewed as a corrupt, abusive, and sectarian Iraqi state. In retrospect, however, many of the people interviewed about ISIS’s governance thought that although ISIS was superior in some aspects of governance to the Iraqi state, the group largely wore out its welcome through its brutal imposition of an interpretation of sharia that was far more extreme than even relatively conservative Sunni Iraqis were willing to accept.
Social neuroscience research has demonstrated that those who are like-minded are also ‘like-brained.’ Studies have shown that people who share similar viewpoints have greater neural synchrony with one another, and less synchrony with people who ‘see things differently.’ Although these effects have been demonstrated at the ‘group level,’ little work has been done to predict the viewpoints of specific ‘individuals’ using neural synchrony measures. Furthermore, the studies that have made predictions using synchrony-based classification at the individual level used expensive and immobile neuroimaging equipment (e.g. functional magnetic resonance imaging) in highly controlled laboratory settings, which may not generalize to real-world contexts. Thus, this study uses a simple synchrony-based classification method, which we refer to as the ‘neural reference groups’ approach, to predict individuals’ dispositional attitudes from data collected in a mobile ‘pop-up neuroscience’ lab. Using fun...
As a result of the COVID-19 pandemic, many organizations and schools have switched to a virtual environment. Recently, as vaccines have become more readily available, organizations and educational institutions have started shifting from virtual environments back to physical office spaces and schools. For the highest level of safety and caution with respect to the containment of COVID-19, the shift to in-person interaction requires a thoughtful approach. With the help of an Integer Programming (IP) optimization model, it is possible to formulate an objective function and constraints that determine a safe way of returning to the office through cohort development. In addition to our IP formulation, we developed a heuristic approximation method. Starting with an initial contact matrix, these methods aim to reduce the additional contacts introduced by the subgraphs representing the cohorts. These formulations can be generalized to other applications that benefit from constrained community detection.
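The greedy idea behind such a heuristic can be sketched in plain Python. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: `contact` is assumed to be a symmetric 0/1 matrix of prior contacts, and `greedy_cohorts`/`added_contacts` are hypothetical names. Each person is placed in the cohort that introduces the fewest contacts with people they have not already met.

```python
def greedy_cohorts(contact, k, capacity):
    """Greedily assign people to k cohorts of bounded size, preferring
    cohorts whose current members they already had contact with."""
    n = len(contact)
    cohorts = [[] for _ in range(k)]
    # place the most-connected people first
    order = sorted(range(n), key=lambda i: -sum(contact[i]))
    for person in order:
        best, best_new = None, None
        for c in range(k):
            if len(cohorts[c]) >= capacity:
                continue
            # contacts this assignment would newly introduce
            new = sum(1 for m in cohorts[c] if contact[person][m] == 0)
            if best is None or new < best_new:
                best, best_new = c, new
        cohorts[best].append(person)
    return cohorts

def added_contacts(contact, cohorts):
    """Count same-cohort pairs with no prior contact (the objective)."""
    total = 0
    for members in cohorts:
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                total += 1 - contact[a][b]
    return total
```

For example, with two prior-contact pairs (0, 1) and (2, 3), two cohorts of size two can be formed that add no new contacts at all.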
This paper examines quantity and quality superposter value creation within Coursera Massive Open Online Courses (MOOC) forums using a social network analysis (SNA) approach. The value of quantity superposters (i.e. students who post significantly more often than the majority of students) and quality superposters (i.e. students who receive significantly more upvotes than the majority of students) is assessed using Stochastic Actor-Oriented Modeling (SAOM) and network centrality calculations. Overall, quantity and quality superposting was found to have a significant effect on tie formation within the discussion networks. In addition, quantity and quality superposters were found to have higher-than-average information brokerage capital within their networks.
Changes in observed social networks may signal an underlying change within an organization, and may even predict significant events or behaviors. The breakdown of a team’s effectiveness, the emergence of informal leaders, or the preparation of an attack by a clandestine network may all be associated with changes in the patterns of interactions between group members. The ability to systematically, statistically, effectively, and efficiently detect these changes has the potential to enable anticipation, early warning, and faster response to both positive and negative organizational activities. By applying statistical process control techniques to social networks we can rapidly detect changes in these networks. Herein we describe this methodology and then illustrate it using four data sets: the first is the Newcomb fraternity data; the second was collected on a group of mid-career U.S. Army officers in a week-long training exercise; the third is the perceived con...
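The core of the approach, monitoring a graph-level statistic over time with a control chart, can be sketched in a few lines. This is a minimal illustration rather than the chapter's method: `density` and `shewhart_signals` are hypothetical names, and the 3-sigma limits are estimated from an assumed in-control baseline window.

```python
from statistics import mean, stdev

def density(edges, n):
    """Density of an undirected graph: edges present / edges possible."""
    possible = n * (n - 1) / 2
    return len(edges) / possible

def shewhart_signals(series, baseline_len, k=3.0):
    """Flag observations outside mean +/- k*sigma of the baseline window."""
    base = series[:baseline_len]
    mu, sigma = mean(base), stdev(base)
    return [i for i, x in enumerate(series[baseline_len:], start=baseline_len)
            if abs(x - mu) > k * sigma]
```

A sudden jump in network density after a stable baseline is flagged immediately, which is the kind of organizational change signal described above.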
After more than a year of non-pharmaceutical interventions, such as lockdowns and masks, questions remain on how effective these interventions were and could have been. The vast differences in the enforcement of and adherence to policies add complexity to a problem already surrounded by significant uncertainty. This necessitates a model of disease transmission that can account for these spatial differences in interventions and compliance. In order to measure and predict the spread of disease under various intervention scenarios, we propose a Microscopic Markov Chain Approach (MMCA) in which spatial units each follow their own Markov process for the state of disease but are also connected through an underlying mobility matrix. Cuebiq, an offline intelligence and measurement company, provides aggregated, anonymized cell-phone mobility data which reveal how population behaviors have evolved over the course of the pandemic. These data are leveraged to infer mobility patterns across regions and contact patterns within those regions. The data enable the estimation of a baseline for how the pandemic spread under the true ground conditions, so that we can analyze how different shifts in mobility affect the spread of the disease. We demonstrate the efficacy of the model through a case study of spring break and its impact on how the infection spread in Florida during the spring of 2020, at the onset of the pandemic.
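A much-simplified discrete-time version of the idea, regional SIR compartments coupled through a mobility matrix, can be sketched as follows. This is not the paper's MMCA formulation; the row-stochastic `mobility` matrix and the parameter values are illustrative assumptions.

```python
def step(S, I, R, mobility, beta, gamma):
    """One discrete time step of a mobility-coupled SIR model: each
    region's force of infection is a mobility-weighted mix of the
    infectious fractions in connected regions."""
    n = len(S)
    # effective infectious fraction seen by region i
    eff = [sum(mobility[i][j] * I[j] for j in range(n)) for i in range(n)]
    S2, I2, R2 = [], [], []
    for i in range(n):
        new_inf = beta * S[i] * eff[i]
        new_rec = gamma * I[i]
        S2.append(S[i] - new_inf)
        I2.append(I[i] + new_inf - new_rec)
        R2.append(R[i] + new_rec)
    return S2, I2, R2
```

Even with infection seeded in only one region, the mobility coupling lets the epidemic leak into connected regions, which is the mechanism the spring-break case study exercises.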
ISIS and similar extremist communities are increasingly using forums in the darknet to connect with each other and spread news and propaganda. In this paper, we attempt to understand their network in an online forum by using descriptive statistics, an exponential random graph model (ERGM) and Topic Modeling. Our analysis shows how the cohesion between active members forms and grows over time and under certain thread topics. We find that the top attendants of the forum have high centrality measures and other attributes of influencers.
The US Army Research Laboratory (ARL) currently conducts tests on anti-ballistic armor for military uses. This research is concerned with determining the limit velocity (vL) of different target-penetrator combinations. The limit velocity is the highest velocity a penetrator can have without penetrating the target. Unfortunately, penetration processes are highly complex and an effective first-principles derivation of vL has not been discovered. Estimation of vL is therefore done empirically. Furthermore, ballistics tests can be very expensive, resulting in a small sample size with which to perform statistical data analysis. There are two ballistics testing methods commonly used to estimate vL. The Jonas Lambert method involves measuring the residual velocity of the projectile after perforation. The bisection method, or V50, simply evaluates the perforation without residual velocity. The second method is significantly less expensive. Simulation is used to model both of the common ...
Bots are often identified on social media due to their behavior. How easily are they identified, however, when they are dormant and exhibit no measurable behavior at all, except for their silence? We identified “dormant bot networks” positioned to influence social media discourse surrounding the 2018 U.S. Senate election. A dormant bot is a social media persona that does not post content yet has large follower and friend relationships with other users. These relationships may be used to manipulate online narratives and elevate or suppress certain discussions in the social media feed of users. Using a simple structure-based approach, we identify a large number of dormant bots created in 2017 that begin following the social media accounts of numerous US government politicians running for re-election in 2018. Findings from this research were used by the U.S. Government to suspend dormant bots prior to the elections to prevent any malign influence campaign. Application of this approach ...
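A simple structure-based screen of the kind described can be sketched as a filter over account records: flag accounts that have never posted yet hold many friend/follower relationships and were created after a given date. The field names and thresholds here are illustrative assumptions, not the paper's actual criteria.

```python
from datetime import date

def dormant_candidates(accounts, min_friends=100,
                       created_after=date(2017, 1, 1)):
    """Crude structural screen for dormant bots: silent accounts
    (zero posts) with many relationships, created after a cutoff."""
    return [a["id"] for a in accounts
            if a["posts"] == 0
            and a["friends"] >= min_friends
            and a["created"] > created_after]
```

In practice such a screen only produces candidates; attribution still requires examining who the flagged accounts follow and when they were created relative to one another.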
Abstract Traditional statistical process monitoring (SPM) provides a useful starting point for framing and solving network monitoring problems. In this paper the panelists discuss similarities and differences between the two fields and they describe many challenges and open problems in contemporary network monitoring research. The panelists also discuss potential outlets and avenues for disseminating such research.
Basketball is an inherently social sport, which implies that social dynamics within a team may influence the team's performance on the court. As NBA players use social media, it may be possible to study the social structure of a team by examining the relationships that form within social media networks. This paper investigates the relationship between publicly available online social networks and quantitative performance data. It is hypothesized that network centrality measures for an NBA team's network will correlate with measurable performance metrics such as win percentage, point differential, and assists per play. The hypothesis is tested using exponential random graph models (ERGM) and by investigating correlation between network and performance variables. The results show league-wide trends correlating certain network measures with game performance, and quantify the effects of various player attributes on network formation.
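The correlation side of such an analysis reduces to computing, for example, a Pearson correlation between a centrality measure and a performance metric across teams. A minimal stdlib sketch (the function name and inputs are illustrative, not the paper's code):

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two paired samples, e.g. a team-level
    network measure and a team-level performance metric."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5
```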
The United States is becoming increasingly politically divided. In addition to polarization between the two major political parties, there is also divisiveness in intra-party dynamics. In this paper, we attempt to understand these intra-party divisions by using an exponential random graph model (ERGM) to compute a political cohesion metric that quantifies the strength within a party at a given point in time. The analysis is applied to the 105th through 113th congressional sessions of the House of Representatives. We find that the Republican Party not only generally exhibits stronger intra-party cohesion, but, when voting patterns are broken out by topic, also has a higher and more consistent cohesion factor compared to the Democratic Party.
The novel coronavirus, SARS-CoV-2, commonly known as COVID-19, became a global pandemic in early 2020. The world has mounted a global social distancing intervention on a scale thought unimaginable prior to this outbreak; however, the economic impact and sustainability limits of this policy create significant challenges for government leaders around the world. Understanding the future spread and growth of COVID-19 is further complicated by data quality issues due to high numbers of asymptomatic patients who may transmit the disease yet show no symptoms; lack of testing resources; failure of recovered patients to be counted; delays in reporting hospitalizations and deaths; and the co-morbidity of other life-threatening illnesses. We propose a Monte Carlo method for inferring true case counts from observed deaths using clinical estimates of Infection Fatality Ratios and Time to Death. Findings indicate that current COVID-19 confirmed positive counts represent a small fraction of actual...
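The back-calculation step can be sketched as a Monte Carlo draw over the Infection Fatality Ratio: each sampled IFR implies total infections = deaths / IFR. This toy version uses assumed IFR bounds, ignores the time-to-death lag, and reports only a median and a 90% interval, so it is an illustration of the mechanism rather than the paper's model.

```python
import random

def implied_infections(total_deaths, ifr_range=(0.005, 0.015),
                       n_sims=10000, seed=0):
    """Monte Carlo: for each draw of the Infection Fatality Ratio (IFR),
    observed deaths imply total infections = deaths / IFR. Returns the
    median and a 90% interval over the simulated totals."""
    rng = random.Random(seed)
    sims = sorted(total_deaths / rng.uniform(*ifr_range)
                  for _ in range(n_sims))
    return (sims[len(sims) // 2],          # median
            sims[int(0.05 * n_sims)],      # 5th percentile
            sims[int(0.95 * n_sims)])      # 95th percentile
```

Because every draw divides deaths by a fatality ratio well below 1, even a modest death count implies a far larger infection count, which is the qualitative finding the abstract reports.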
Pages: 9. Pub Types: Journal Articles; Reports - Evaluative. Abstract: This study introduces a new method of evaluating human comprehension in the context of machine translation using a language translation program known as the FALCon (Forward Area Language Converter) ...
Novel diseases such as COVID-19 present challenges for identifying and assessing the impact of public health interventions due to incomplete and inaccurate data. Many infected persons may be asymptomatic, pre-symptomatic, or may choose to not seek medical treatment. Insufficient testing and reporting standards coupled with reporting delays may also affect the accuracy of case counts, recovery rates, fatalities, and other key metrics used to model the disease. High error in these metrics is propagated to all aspects of public health response, including estimates of daily transmission rates. We propose a method that integrates Monte Carlo simulation based on clinical studies, linear noise approximation (LNA), and Hidden Markov Models (HMMs) to estimate the daily reproductive number. Results are validated against known state population behavior, such as social distancing and stay-at-home orders. The proposed approach provides improved model initial conditions resulting in reduced error and su...
Information operations on social media have recently attracted the attention of media outlets, research organizations and governments, given the proliferation of high-profile cases such as the alleged foreign interference in the 2016 US presidential election. Nation-states and multilateral organizations continue to face challenges while attempting to counter false narratives, due to lack of familiarity and experience with online environments, limited knowledge and theory of human interaction with and within these spaces, and the limitations imposed by those who own and maintain social media platforms. In particular, these attributes present unique difficulties for the identification and attribution of campaigns, tracing information flows at scale, and identifying spheres of influence. Complications include the anonymity and competing motivations of online actors, poorly understood platform dynamics, and the sparsity of information regarding message transferal across communication pl...
Based on a comprehensive study of 20 established data sets, we recommend training set sizes for any classification data set. We obtain our recommendations by systematically withholding training data and developing models through five different classification methods for each resulting training set. Based on these results, we construct accuracy confidence intervals for each training set size and fit the lower bounds to inverse power law learning curves. We also estimate a sufficient training set size (STSS) for each data set based on established convergence criteria. We compare STSS to the data sets' characteristics; based on identified trends, we recommend training set sizes between 3,000 and 30,000 data points, according to a data set's number of classes and number of features. Because obtaining and preparing training data has non-negligible costs that are proportional to data set size, these results afford the potential opportunity for substantial savings for predictive mode...
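Fitting accuracies to an inverse power law learning curve of the form acc(n) ≈ a − b·n^(−c) can be sketched with a brute-force grid search over the three parameters. This is an illustration of the curve form only; the paper's fitting procedure and parameter ranges are not specified here, so the grid below is an assumption.

```python
def fit_power_law(sizes, accuracies):
    """Grid-search fit of acc(n) ~ a - b * n**(-c), the inverse power
    law learning-curve form, minimizing squared error. `a` is the
    asymptotic accuracy, `b` a scale, `c` the decay rate."""
    best = None
    for a in [x / 100 for x in range(50, 101)]:           # asymptote
        for b in [x / 10 for x in range(1, 51)]:          # scale
            for c in [x / 100 for x in range(5, 101, 5)]:  # decay rate
                err = sum((acc - (a - b * n ** -c)) ** 2
                          for n, acc in zip(sizes, accuracies))
                if best is None or err < best[0]:
                    best = (err, a, b, c)
    return best[1:]  # (a, b, c)
```

The fitted asymptote `a` is what a convergence criterion would be applied to when estimating a sufficient training set size.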
Network science has been applied in the hard and soft sciences for several decades. Founded in graph theory, network science is now an expansive approach to the analyses of complex networks of many types of objects (events, people, locations, etc.). Researchers are finding that techniques and tools used in social network analysis have relevant application in projects that span more than just relationships between people. This paper discusses the application of network analysis in a postgraduate course on information security and risks in organisational settings as a special topic course.
Network data provides valuable insight into understanding complex organizations by modeling relational dependence between network agents. Detecting subtle changes in organizational behavior can alert analysts before the change significantly impacts the larger group. Statistical process control is applied to dynamic network measures of longitudinal data to quickly detect organizational change. The performance of 10 network measures and three algorithms is evaluated on simulated data. One of the algorithms and one of the network measures are used to demonstrate change detection on the Al-Qaeda terrorist network. There is no statistically significant difference in the performance of the investigated algorithms; however, the cumulative sum control chart has a built-in estimate of the actual time a change may have occurred.
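The cumulative sum (CUSUM) chart's built-in change-time estimate can be illustrated directly: when the one-sided statistic exceeds the decision interval, the change is conventionally estimated as the most recent time the statistic was at zero. A minimal sketch with an assumed reference value `k` and decision interval `h` (not the paper's parameterization):

```python
def cusum(series, mu0, k, h):
    """One-sided upper CUSUM chart. Returns (signal_index, change_estimate):
    the chart signals when the statistic exceeds h, and the change time is
    estimated as the step after the last time the statistic was zero."""
    s, last_zero = 0.0, -1
    for i, x in enumerate(series):
        s = max(0.0, s + (x - mu0 - k))
        if s == 0.0:
            last_zero = i
        if s > h:
            return i, last_zero + 1
    return None, None  # no signal
```

Note that the signal index lags the estimated change time: the chart accumulates evidence before crossing `h`, but the zero-crossing bookkeeping recovers when the shift likely began.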
With the rise in popularity of social media, these platforms present a new opportunity to reach potential job candidates for employment opportunities. The current literature lacks sufficient research on methods and best practices to design and assess the efficacy of recruiting and hiring campaigns delivered on social media. We present a case study of a government e-recruiting effort discovered on Twitter. We collected almost 20 thousand tweets using the hashtag #FBIJobs, including both original tweets and retweets. Applications of descriptive statistics, topic modeling, sentiment analysis, and graph analytics identify where the campaign may miss potentially interested job candidates. We also find evidence of “popularity transfer,” where co-mentions appear to increase the visibility of an account's content in public feeds without transferring the sentiment surrounding the more popular account. The research and findings were based on a publicly available e-recruiting campaign found online, witho...
Current supervised deep learning frameworks rely on annotated data for modeling the underlying data distribution of a given task. In particular, for computer vision algorithms powered by deep learning, the quality of annotated data is the most critical factor in achieving the desired algorithm performance. Data annotation is typically a manual process where the annotator follows guidelines and operates in a best-guess manner. Labeling criteria can differ among annotators, producing discrepancies in labeling results that may impact the algorithm's inference performance. Given the popularity and widespread use of deep learning in computer vision, more and more custom datasets are needed to train neural networks to tackle different kinds of tasks. Unfortunately, there is no full understanding of the factors that affect annotated data quality, and how it translates into algorithm performance. In this paper we studied this problem for object detection and recognition. We conducted several data anno...
A new model for a random graph is proposed that can be constructed from empirical data and has some desirable properties compared to scale-free graphs [1, 2, 3] for certain applications. The newly proposed random graph maintains the same "small-world" properties [3, 4, 5] of the scale-free graph, while allowing mathematical modeling of the relationships that make up the random graph. E-mail communication data was collected on a group of 24 mid-career Army officers in a one-year graduate program [6] to validate necessary assumptions for this new class of random graphs. Statistical distributions on graph level measures are then approximated using Monte Carlo simulation and used to detect change in a graph over time.
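The final step, approximating the distribution of a graph-level measure by Monte Carlo simulation and flagging change when an observed value falls in the tail, can be sketched as follows. For illustration this uses an Erdős–Rényi G(n, p) null model rather than the paper's proposed random graph class, and all names are hypothetical.

```python
import random

def simulated_density_dist(n, p, n_sims=2000, seed=0):
    """Monte Carlo null distribution of graph density under G(n, p):
    each of the n*(n-1)/2 possible edges appears with probability p."""
    rng = random.Random(seed)
    pairs = n * (n - 1) // 2
    dist = []
    for _ in range(n_sims):
        edges = sum(1 for _ in range(pairs) if rng.random() < p)
        dist.append(edges / pairs)
    return sorted(dist)

def empirical_p_value(dist, observed):
    """Two-sided empirical tail probability of an observed density."""
    n = len(dist)
    upper = sum(1 for d in dist if d >= observed) / n
    lower = sum(1 for d in dist if d <= observed) / n
    return min(1.0, 2 * min(upper, lower))
```

An observed density far outside the simulated distribution yields a small empirical p-value and is flagged as a change, while a value near the null mean is not.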
Humans are autonomous, intelligent, and adaptive agents. By adopting social network analysis techniques, we submit a framework for the study of dynamic networks and demonstrate the use of actor-oriented specifications in longitudinal networks. Through the use of a unique command and control dataset from experiments run at the US Military Academy, we illustrate the power of testing hypotheses on actor utility profiles. We frame static, covariate factors onto communication networks, and find that statistical hypothesis testing indicates edge networks truly motivate soldiers to seek information, collaborate, and modify the social network around them into more comfortable configurations of triad closure and edge reciprocity, when compared to hierarchical networks: a finding with profound implications for the study of complex, adaptive social systems.
Extracting (social) network data and conducting effective searches of large document collections requires large corpora of labelled, annotated training data from which to build and validate classifiers. As the importance and value of data grows, industry and government organizations are investing in large teams of individuals who annotate data at unprecedented scale. While much is understood about machine learning, little attention is applied to methods and considerations for managing and leading annotation efforts. This paper presents several metrics to measure and monitor performance and quality in large annotation teams. Recommendations for leadership best practices are proposed and evaluated within the context of an annotation effort led by the authors in support of U.S. government intelligence analysis. Findings demonstrate significant improvement in annotator utilization, inter-annotator agreement, and rate of annotation through prudent management best-practices.
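One standard metric of the kind such monitoring relies on is Cohen's kappa, chance-corrected agreement between two annotators (a minimal sketch; the paper's exact metrics may differ):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement
    between two annotators labeling the same items. Undefined when
    expected agreement is 1 (both annotators use a single label)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    # expected chance agreement from each annotator's label frequencies
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa equals 1.0 for perfect agreement and 0.0 when agreement is exactly what chance would predict, which is why raw percent agreement alone overstates annotator reliability.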
Understanding how underlying health conditions and social determinants of health affect the severity of COVID-19 is critical for community response planning. Literature reports that groups at higher risk from COVID-19 include those 65 and older, those living in nursing homes and long-term care facilities, and those with severe obesity, diabetes, chronic lung disease, or asthma. In addition, other studies have shown that the disease disproportionately affects individuals with lower socio-economic status. Our research seeks to validate these findings and observe the effects of health measures and social determinants of health on COVID-19 mortality at the county level. In addition to COVID-19 research from hospital population samples, public health officials can leverage county-level factors for novel disease mitigation. We use the Johns Hopkins University COVID-19 reports of confirmed cases and deaths to measure disease mortality for each county in the United States. Then, we compare mortality to ...
As machine learning becomes a more mainstream technology, the objective for governments and public sectors is to harness the power of machine learning to advance their mission by revolutionizing public services. Motivating government use cases require special considerations for implementation given the significance of the services they provide. Not only will these applications be deployed in a potentially hostile environment that necessitates protective mechanisms, but they are also subject to government transparency and accountability initiatives, which further complicate such protections. In this paper, we describe how the inevitable interactions between a user of unknown trustworthiness and the machine learning models, deployed in governments and public sectors, can jeopardize the system in two major ways: by compromising its integrity or by violating privacy. We then briefly overview the possible attacks and defense scenarios, and finally, propose recommendations and guide...
Centrality in a social network is found to have a significant effect on Asch-type conformity. Friendship affinity and respect social network data were collected on two different groups of actors. The effects of Asch-type conformity were empirically tested on central actors and peripheral actors in each group using a culturally appropriate version of Asch’s test. Findings show that central actors are less willing to conform and peripheral actors are more willing to conform than expected in Asch-type social conformity experiments. Authors: Ian McCulloh, Ph.D. is a visiting research fellow at the Centre for Organisational Analysis in the School of Information Systems at the Curtin University Business School in Perth, Western Australia. Please email all correspondence to Ian McCulloh at cusum6@gmail.com. Notes: Thanks to Dominick Lombardi for his assistance in designing, executing and recording data for the empirical experiments.
Email provides a rich source of longitudinal social network data that can be used for applications ranging from command and control, to military intelligence, to basic social science research. This project reviews several methods available to extract email network data and compares them in terms of data quality and convenience of collection. In general, it is preferable to obtain email data directly from the central SMTP email server. In situations where this is not possible, alternative approaches presented here can be useful. These techniques for analyzing email data have been automated in the Organizational Risk Analyzer (ORA) software, which is freely available to DoD and academia.
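Header-based extraction of an email network can be sketched as follows, assuming raw RFC 822 messages are available; the address handling is simplified relative to what ORA or a central SMTP server log would provide:

```python
from email import message_from_string
from collections import Counter

def email_edges(raw_messages):
    """Build a directed, weighted sender->recipient edge list from raw
    RFC 822 messages by reading the From, To, and Cc headers."""
    edges = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        sender = (msg.get("From") or "").strip()
        for field in ("To", "Cc"):
            for rcpt in (msg.get(field) or "").split(","):
                rcpt = rcpt.strip()
                if sender and rcpt:
                    edges[(sender, rcpt)] += 1
    return edges

# A single hypothetical message with two recipients.
raw = ["From: alice@example.org\nTo: bob@example.org, carol@example.org\n\nhi"]
net = email_edges(raw)
print(net)
```

Each message contributes one weighted edge per recipient, which is the usual convention for longitudinal email networks.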
As social media use grows and increasingly becomes a forum for social debate in politics, social issues, sports, and brand sentiment, accurately classifying social media sentiment remains an important computational challenge. Social media posts present numerous challenges for text classification. This paper presents an approach to introduce guided decision trees into the design of a crowdsourcing platform to extract additional data features, reduce task cognitive complexity, and improve the quality of the resulting labeled text corpus. We compare the quality of the proposed approach with off-the-shelf sentiment classifiers and a crowdsourced solution without a decision tree, using a tweet sample from the social media firestorm #CancelColbert. We find that the proposed crowdsource-with-decision-tree approach results in a training corpus of higher quality, necessary for effective classification of social media content.
Twitter has become an important tool for communication and marketing. Topic model algorithms meant to characterize the discourse of online conversations and identify relevant audiences do not perform well for this task, despite their widespread usage. This paper proposes an iterative topic model, Gamma Filtration, and a social network-based method, Simmelian Filtration, to amplify tweet-topic probability signal and reduce noise. We demonstrate the method on a novel data set collected of European Racially and Ethnically Motivated Violent Extremist (REMVE) networks on Twitter. We find that Simmelian Filtering is most successful at reducing noise as measured by perplexity. This improves our ability to detect and monitor core conversations of a community that is disseminating propaganda to increase online extremism.
Initial scientific studies suggest the spread of extreme content, “the COVID-19 infodemic,” likely plays a crucial role in news spread about the “COVID-19 pandemic.” In this paper, we quantify the evolution of polarization and engagement in YouTube social networks about public-health interventions for COVID-19. Although YouTube is a major information and news source with high engagement with younger populations, the platform is not widely researched in social network analysis. Discussions about coronavirus on social media can influence how individuals interpret news about the disease and affect their compliance with various non-pharmaceutical interventions. We compare coronavirus video content by identifying three subgroups of public-health intervention-related videos: individual interventions, government interventions, and medical interventions, as well as seven video title narratives. The polarization index measures the level of agreement with the video content using votes: li...
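The paper's exact polarization index is truncated above; purely as an illustration, one simple vote-based measure might look like the following (the formula and video names are hypothetical, not necessarily the paper's definition):

```python
def polarization(likes, dislikes):
    """An illustrative vote-based polarization score in [0, 1]:
    0 when votes are unanimous, 1 when likes and dislikes split evenly.
    This is a hypothetical measure, not necessarily the paper's index."""
    total = likes + dislikes
    if total == 0:
        return 0.0
    share = likes / total
    # Distance of the like-share from unanimity, rescaled to [0, 1].
    return 1.0 - abs(2 * share - 1)

# Invented vote counts for two videos.
videos = {"gov_intervention": (900, 100), "medical": (500, 500)}
scores = {name: polarization(up, down) for name, (up, down) in videos.items()}
print(scores)
```

A lopsided vote (900 vs. 100) scores low, while an even split scores 1.0, the maximally polarized case.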
Customer satisfaction surveys, which have been the most common way of gauging customer feedback, involve high costs, require active customer participation, and typically involve low response rates. The tremendous growth of social media platforms such as Twitter provides businesses an opportunity to continuously gather and analyze customer feedback, with the goal of identifying and rectifying issues. This paper examines the alternative of replacing traditional customer satisfaction surveys with social media data. To evaluate this approach, the following steps were taken, using customer feedback data extracted from Twitter: 1) Applying sentiment to each Tweet to compare the overall sentiment across different products and/or services. 2) Constructing a hashtag co-occurrence network to further optimize the customer feedback query process from Twitter. 3) Comparing customer feedback from survey responses with social media feedback, while considering content and added value. We find that social media provides advantages over traditional surveys.
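Step 2, the hashtag co-occurrence network, can be sketched as follows (the tweets are invented; a real pipeline would also strip punctuation and handle retweets):

```python
from itertools import combinations
from collections import Counter

def hashtag_cooccurrence(tweets):
    """Build a weighted hashtag co-occurrence network: nodes are hashtags,
    and an edge's weight counts tweets in which both hashtags appear."""
    edges = Counter()
    for text in tweets:
        tags = sorted({w.lower() for w in text.split() if w.startswith("#")})
        for a, b in combinations(tags, 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical customer-feedback tweets.
tweets = [
    "Love the new phone #BrandX #battery",
    "#BrandX #battery drains too fast",
    "#BrandX support was great",
]
net = hashtag_cooccurrence(tweets)
print(net[("#battery", "#brandx")])  # → 2
```

Frequently co-occurring hashtags suggest additional query terms for broadening the feedback collection.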
Sectarian violence continues in Iraq, affecting regional and world security. Neuroscience techniques are used to assess the mentalizing process and counter-arguing in response to videos designed to prevent extremist radicalization. Measurement of neural activity in brain Regions of Interest (ROI) assists identification of messages which can promote favorable behavior. Activation of the Medial Prefrontal Cortex (MPFC) is associated with message adoption and behavior change. Public Service Announcements (PSAs) have not been effective in reducing violence in Iraq. This study demonstrates that the four PSAs investigated do not activate the MPFC. The Right Lateral Prefrontal Cortex (RLPFC) is a brain ROI associated with counter-arguing and message resistance. This study demonstrates that reduction in activity in the RLPFC is associated with decreased sectarianism. Engagement was measured and is associated with activity in the frontal pole regions. We introduce Functional Near...
Many disinformation and propaganda campaigns on social media platforms deploy social bots, artificial accounts that pose as humans, to disseminate political propaganda. To understand the effects of the presence of bots as conduits for information transference in online social networks, we use Exponential Family Random Graph Models (ERGMs) to examine the structure of a bot network during a propaganda campaign in Ecuador in October 2019. We find heterophily and transitivity between bot and human actors, but that bots are less likely to engage with humans than with other bots. This may represent a tactic deliberately deployed to maximize influence by exploiting Twitter's algorithm for showing content. The use of ERGMs produces greater insight into this Twitter bot network, and we believe that this methodology should be extended in future work.
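Fitting an ERGM requires specialized tooling (e.g., the statnet suite in R); a descriptive first look at bot-human mixing can nevertheless be sketched in a few lines (the edges and bot labels are invented):

```python
from collections import Counter

def mixing_matrix(edges, is_bot):
    """Count edges by endpoint type (bot/human): a descriptive companion to
    ERGM homophily/heterophily terms, not an ERGM fit itself."""
    counts = Counter()
    for u, v in edges:
        pair = tuple(sorted("bot" if is_bot[x] else "human" for x in (u, v)))
        counts[pair] += 1
    return counts

# Hypothetical retweet edges and account labels.
is_bot = {"a": True, "b": True, "c": False, "d": False}
edges = [("a", "b"), ("a", "c"), ("b", "a"), ("c", "d")]
mm = mixing_matrix(edges, is_bot)
print(mm)
```

A bot-bot count that is high relative to bot-human counts is the descriptive analogue of the finding that bots engage with other bots more than with humans.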
Network evolution is an important problem for social scientists, management consultants, and social network scholars. Unfortunately, few empirical data sets exist with sufficient data to fully explore evolution dynamics. Increasingly, online data sets are used in lieu of offline, face-to-face data. The veracity of these findings is questionable, however, because there are few studies exploring the similarity of online-offline dynamics. The IkeNet project investigated online and offline network evolution. Empirical data were collected on a group of 22 mid-career military officers going through a one-year graduate program. Data collection included email communication collected from the Exchange server, as well as self-reported friendship and time spent together, over a course of 20 weeks. Numerous attribute data on the individual actors were collected from their military personnel files. The data allow network scholars to conduct research into the dynamics of net...
A comprehensive introduction to social network analysis that homes in on basic centrality measures, social links, subgroup analysis, data sources, and more. Written by military, industry, and business professionals, this book introduces readers to social network analysis, the new and emerging topic that has recently become of significant use for industry, management, law enforcement, and military practitioners for identifying both vulnerabilities and opportunities in collaborative networked organizations. Focusing on models and methods for the analysis of organizational risk, Social Network Analysis with Applications provides easily accessible, yet comprehensive coverage of network basics, centrality measures, social link theory, subgroup analysis, relational algebra, data sources, and more. Examples of mathematical calculations and formulas for social network measures are also included. Along with practice problems and exercises, this easily accessible book covers:
* The basic concepts of networks, nodes, links, adjacency matrices, and graphs
* Mathematical calculations and exercises for centrality, the basic measures of degree, betweenness, closeness, and eigenvector centralities
* Graph-level measures, with a special focus on both the visual and numerical analysis of networks
* Matrix algebra, outlining basic concepts such as matrix addition, subtraction, multiplication, and transpose and inverse calculations in linear algebra that are useful for developing networks from relational data
* Meta-networks and relational algebra, social links, diffusion through networks, subgroup analysis, and more
An excellent resource for practitioners in industry, management, law enforcement, and military intelligence who wish to learn and apply social network analysis to their respective fields, Social Network Analysis with Applications is also an ideal text for upper-level undergraduate and graduate level courses and workshops on the subject.
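Two of the centrality measures the book covers can be sketched from first principles; the four-actor network below is an invented example:

```python
from collections import deque

def degree_centrality(adj):
    """Normalized degree centrality: ties divided by the maximum possible (n-1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj, v):
    """Closeness: (n-1) divided by the sum of shortest-path distances from v,
    computed by breadth-first search (assumes a connected graph)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(adj) - 1) / sum(dist.values())

# A small invented network in which actor b is the most central.
adj = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
dc = degree_centrality(adj)
print(dc["b"], closeness_centrality(adj, "b"))
```

Actor b reaches everyone directly, so both measures give b the maximum score of 1.0.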
Generalized Cumulative Sum Control Charts. By Ian McCulloh. A thesis submitted to the Department of Industrial Engineering, The Florida State University College of Engineering, in partial ...
Abstract: OUTLINE: (1) Objective problem; (2) Why simulation? (3) System specification; (4) Experiment and analysis; (5) MA206 probability and statistics; (6) Sensitive equipment decontamination; (7) Conclusions.
Abstract: This paper suggests an improved measure for evaluating the usefulness of automated machine language translators. With the Global War on Terror (GWOT), the Army's interest in and need for accurate language translation is greater than ever. Today, there are approximately 20,000 linguists with language training in either the Active Duty or Reserve components of the U.S. Army. Coalition operations and U.S. presence in Iraq, Kuwait, and other areas in the Middle East require Arabic translation. Unfortunately, the Army has never been able to maintain the number of linguists it needs, particularly in the hard-to-fill, low-density languages. Previous evaluations of machine translations usually rely on word error rate. Machine translation systems should be rated not in terms of their word error rate but in terms of human comprehension and usefulness, which is some function of word translation, syntax translation, and semantic interpretation. This study introduces a new method of evaluating human comprehension in the context of machine translation using a language translation program known as the Forward Area Language Converter (FALCon). A study was conducted in which participants received seven translated articles in a random order. For each of the seven articles, the participants received a set of corresponding comprehension questions. The goal of the questions was to gear the reader toward intelligence gathering and to see whether readers could grasp main concepts and details. The results of this study suggest that word error rate is not an effective measure of the usefulness of a machine language translator. Comprehension tests perform better at evaluating a human's understanding of a translated document. This study further indicates strengths and weaknesses in each translator.
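Word error rate, the baseline measure the paper argues against, is a word-level edit distance; a minimal sketch (the sentence pair is invented):

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance divided by the number
    of reference words. The paper argues this alone is a poor proxy for
    human comprehension of machine translation."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution
        prev = cur
    return prev[-1] / len(ref)

wer = word_error_rate("the convoy departed at dawn", "the convoy left at dawn")
print(wer)  # → 0.2
```

Note how a single substitution ("departed" vs. "left") yields a 20% error rate even though comprehension is barely affected, which is exactly the gap the study highlights.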
Abstract: Text analysis is a new tool with many interesting possibilities for intelligence gathering. Software being developed by Carnegie Mellon University can output a mental model of a text with the top six concepts in that text. This can be used to automatically analyze thousands of texts to search for keywords, find trends over time, or compare two different geographic areas. The problem is that most of the texts that intelligence analysts would use are not in English. Machine translators like the Forward Area Language Converter (FALCon) produce English text that is hard for the average person to read, but this machine-translated text appears to be just as useful for text analysis as human-translated text. Using machines instead of humans to translate text can save intelligence agencies time and money.
Pages: 9. Pub Types: Journal Articles; Reports - Evaluative. Abstract: This study introduces a new method of evaluating human comprehension in the context of machine translation using a language translation program known as the FALCon (Forward Area Language Converter) ...
Social network analysis (SNA) has become an important analytic tool for analyzing terrorist networks, friendly command and control structures, arms trade, biological warfare, and the spread of diseases, among other applications. Detecting dynamic changes over time from an SNA perspective may signal an underlying change within an organization, and may even predict significant events or behaviors. The challenges in detecting network change include the lack of underlying statistical distributions to quantify significant change, as well as high relational dependence affecting assumptions of independence and normality. Additional challenges involve determining an algorithm that maximizes the probability of detecting change, given a risk level for false alarm. A suite of computational and statistical approaches for detecting change are identified and compared. The Neyman-Pearson most powerful test of simple hypotheses is extended as a cumulative sum statistical process control chart to detect network change over time. Anomaly detection approaches using exponentially weighted moving average or scan statistics investigate performance under conditions of potential time-series dependence. Fourier analysis and wavelets are applied to a spectral analysis of social networks over time. Parameter values are varied for all approaches. The results are put in a computational decision support framework. This new approach is demonstrated in multi-agent simulation as well as on eight different real-world data sets. The results indicate that this approach is able to detect change even with high levels of uncertainty inherent in the data. The ability to systematically, statistically, effectively and efficiently detect these changes has the potential to enable the anticipation of change, provide early warning of change, and enable faster appropriate response to change.
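A one-sided CUSUM over a longitudinal network measure can be sketched as follows; the density series and parameters below are hypothetical, and real use would calibrate the slack value k and threshold h from an in-control baseline:

```python
def cusum(measures, target, k, h):
    """One-sided upper CUSUM: accumulate deviations above target + k and
    signal whenever the cumulative sum crosses the decision threshold h."""
    s, signals = 0.0, []
    for x in measures:
        s = max(0.0, s + (x - target - k))
        signals.append(s > h)
    return signals

# A hypothetical network-density series with an upward shift after t = 4.
density = [0.10, 0.11, 0.09, 0.10, 0.10, 0.18, 0.19, 0.20, 0.21]
signals = cusum(density, target=0.10, k=0.02, h=0.10)
print(signals)
```

The chart stays quiet through the in-control period, absorbs the first shifted observation, and then signals persistently, which is the early-warning behavior the dissertation exploits for network change detection.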
Abstract: The Armament Research, Development and Engineering Center, Picatinny (ARDEC) has been tasked with developing a new chemical process to produce lead azide, the key explosive ingredient in detonators. The new process is physically smaller than the traditional process, and incorporates newer technologies to improve process safety while reducing the costs of setting up new production. The new process has been shown to produce lead azide, but the process settings and operations have not yet been fully characterized. Chemical engineers at ARDEC, Picatinny were also unable to produce a lead azide that met military specifications. A shortage of lead azide has placed our country's ability to manufacture detonators in jeopardy, so the timely completion of this effort is important. This research project used a statistically designed experiment and response surface methods to optimize the process settings. The objective is to discover the ideal process settings to produce a lead azide that meets military specifications and is similar to the lead azide produced by the original process. An optimum (minimum) number of process setting trials is required, as a single experimental trial can take over a week to fully analyze. Therefore, sequential and efficient experimentation is critical.
Business organizations are held together not only by formal reporting and authority networks but also by informal networks that connect people across numerous layers of hierarchical organizational structures. People form networks of contacts and communications and through these networks they ‘get things done’. Although extensive research has been carried out on social networks, the application of these methods to organizational risk has not been widely published. However, network analysis does provide a source of information on potential risks to aid decision-makers within an organization. The application of network analyses in identifying and measuring potential risks based upon the analyses of people, knowledge, tasks and resources is presented in this paper.
We review the k-truss algorithm for community detection in networks. The k-truss is an efficient clustering algorithm that holds advantageous properties for many network applications. In this paper, we compare the k-truss performance against other, more well-known community detection algorithms. The k-truss is uniformly more computationally efficient than the Louvain and Clauset-Newman-Moore algorithms in terms of speed and memory, with comparable modularity. Potential applications are discussed.
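The k-truss peeling procedure can be sketched in plain Python; the graph below is a small invented example, and library implementations such as networkx.k_truss offer optimized versions:

```python
def k_truss(edges, k):
    """Return the edges of the k-truss: the maximal subgraph in which every
    edge participates in at least k-2 triangles, found by repeatedly
    peeling away edges with insufficient triangle support."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support = common neighbours = triangles on edge u-v.
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {tuple(sorted((u, v))) for u in adj for v in adj[u]}

# A 4-clique plus a pendant edge: the 3-truss keeps the clique, drops the tail.
edges = [("a","b"), ("a","c"), ("a","d"), ("b","c"), ("b","d"), ("c","d"), ("d","e")]
truss = sorted(k_truss(edges, 3))
print(truss)
```

Because peeling only ever removes edges, the loop terminates quickly, which is one source of the efficiency advantage noted above.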
The practice of students going to the boards to work problems has been a tradition in undergraduate education for over two hundred years. A study was conducted to determine what instructional techniques enabled students to better learn fundamental concepts in mathematics. This study identifies board work as a highly effective instructional technique for developing a student's ability to succeed on
Industrial manufacturing processes can experience a variety of changes to important quality characteristics as a result of tool breakage, tool wear, introduction of new raw materials, and other factors. Statistical process control charts are often used to monitor for changes in quality characteristics for manufacturing processes. The control chart computes a statistic based on measured observations of the process and
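A basic control-chart computation of the kind described can be sketched as follows; the baseline measurements are invented, and practical individuals charts often estimate sigma from moving ranges rather than the sample standard deviation used here:

```python
from statistics import mean, stdev

def shewhart_limits(baseline, n_sigma=3):
    """Center line and n-sigma control limits for an individuals chart,
    estimated from an in-control baseline sample (a textbook sketch)."""
    m, s = mean(baseline), stdev(baseline)
    return m - n_sigma * s, m, m + n_sigma * s

def out_of_control(observations, lcl, ucl):
    """Flag observations falling outside the control limits."""
    return [x for x in observations if not lcl <= x <= ucl]

# Hypothetical in-control measurements of a quality characteristic.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = shewhart_limits(baseline)
flagged = out_of_control([10.0, 10.1, 11.5, 9.9], lcl, ucl)
print(flagged)
```

A point like 11.5, far outside the limits, would prompt investigation of tool breakage, tool wear, or a raw-material change.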
Table 1. Significant Change Points in θ. ... This meeting coincides with the Fortune Magazine article that questions Enron's stock price; legal questions raised about LJM, a company used to hide Enron debt; and problems with the Raptor partnership. ...
A new model for a random graph is proposed that can be constructed from empirical data and has some desirable properties compared to scale-free graphs (1, 2, 3) for certain applications. The newly proposed random graph maintains the same "small-world" properties (3, 4, 5) of the scale-free graph, while allowing mathematical modeling of the relationships that make up the random
The literature suggests a growing interest in the application of network analysis in supply chain management. However, this has been at the organizational rather than the process level. We believe there is value in applying such analysis to internal processes in supply chain networks. This study uses network analysis techniques to investigate the Stewart (8) framework for excellence in supply
Abstract: Two key problems in the study of longitudinal networks are determining when to chunk continuous time data into discrete time periods for network analysis and identifying periodicity in the data. In addition, statistical process control applied to longitudinal social network measures can be biased by the effects of relational dependence and periodicity in the data. Thus, the detection of change is often obscured by random noise. Fourier analysis is used to determine statistically significant periodic frequencies in longitudinal network data. Two approaches are then offered: using significant periods as a basis to chunk data for longitudinal network analysis or using the significant periods to filter the longitudinal data. E-mail communication collected at the United States Military Academy is examined.
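Identifying the dominant periodic frequency with Fourier analysis can be sketched with a plain discrete Fourier transform; the daily e-mail counts below are invented, and in practice numpy.fft plus a significance test against a noise baseline would be used:

```python
import math

def dominant_period(series):
    """Return the period (in samples) of the strongest non-zero frequency
    in a longitudinal series, via a plain-Python DFT for illustration."""
    n = len(series)
    m = sum(series) / n
    centered = [x - m for x in series]  # remove the mean (zero frequency)
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n)
                 for t, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * t / n)
                 for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k

# Hypothetical daily e-mail counts with a clear weekly (7-day) rhythm.
emails = [40, 42, 41, 39, 38, 12, 10] * 4   # four weeks of data
period = dominant_period(emails)
print(period)  # → 7.0
```

The recovered 7-day period would then drive either the chunking of the longitudinal data into weekly networks or the filtering of the weekly cycle before change detection.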