Abstract
Maintaining a productive and collaborative team of developers is essential to the success of Open Source Software (OSS) projects, and hinges upon the trust inherent among the team. Whether a project participant is initiated as a committer is a function of both their technical contributions and their social interactions with other project participants. One’s online social footprint is arguably easier to ascertain and gather than one’s technical contributions; e.g., gathering patch submission information requires mining multiple sources with different formats and then merging the aliases from these sources. In contrast to prior work, where patch submission was found to be an essential ingredient in achieving committer status, here we investigate the extent to which the likelihood of achieving that status can be modeled solely as a social network phenomenon. For 6 different Apache Software Foundation OSS projects, we compile and integrate a set of social measures of the communications network among OSS project participants and a set of technical measures, i.e., OSS developers’ patch submission activities. We use these sets to predict whether a project participant will become a committer, and to characterize their socialization patterns around the time of becoming a committer. We find that the social network metrics, in particular the amount of two-way communication a person participates in, are more significant predictors of one’s likelihood of becoming a committer. Further, we find that this is true to the extent that other predictors, e.g., patch submission info, need not be included in the models. In addition, we show that future committers can be identified with great fidelity using the first three months of data on their social activities. Moreover, the first month of their social links alone is a very useful predictor, coming within 10% of the predictions made with the three-month data.
Interestingly, we find that on average, for each project, one’s level of socialization ramps up before the time of becoming a committer. After obtaining committer status, their social behavior is more individualized, falling into a few distinct modes of behavior. In a significant number of projects, there is a notable social cooling-off period immediately after the initiation. Finally, we find that it is easier to become a committer earlier in the project’s life cycle than it is later, as the project matures. These results should provide insight into the social nature of gaining trust and advancing in status in distributed projects.
Notes
Issue trackers also capture communication between committers and developers. We did not use them because the mailing lists contained a large enough communication sample that was not obviously biased in any way.
Acknowledgements
All authors gratefully acknowledge support from the Air Force Office of Scientific Research, award FA955-11-1-0246. Vasilescu gratefully acknowledges support from the Dutch Science Foundation (NWO), grant NWO 600.065.120.10N235. Part of this research was carried out during Vasilescu’s visits at UC Davis.
Additional information
Communicated by: Yann-Gaël Guéhéneuc and Tom Mens
Cite this article
Gharehyazie, M., Posnett, D., Vasilescu, B. et al. Developer initiation and social interactions in OSS: A case study of the Apache Software Foundation. Empir Software Eng 20, 1318–1353 (2015). https://doi.org/10.1007/s10664-014-9332-x
DOI: https://doi.org/10.1007/s10664-014-9332-x