Planet Four: Probing Springtime Winds on Mars by Mapping the Southern Polar CO2 Jet Deposits

K.-Michael Aye (a,∗), Megan E. Schwamb (b,c,d,e), Ganna Portyankina (a), Candice J. Hansen (f), Adam McMaster (h), Grant R.M. Miller (h), Brian Carstensen (i), Christopher Snyder (i), Michael Parrish (i), Stuart Lynn (i), Chuhong Mai (c,g), David Miller (i), Robert J. Simpson (h), Arfon M. Smith (i,j)

(a) Laboratory for Atmospheric and Space Physics, University of Colorado at Boulder, Boulder, CO 80303, USA
(b) Gemini Observatory, Northern Operations Center, 670 North A’ohoku Place, Hilo, HI 96720, USA
(c) Institute for Astronomy and Astrophysics, Academia Sinica; 11F AS/NTU, National Taiwan University, 1 Roosevelt Rd., Sec. 4, Taipei 10617, Taiwan
(d) Yale Center for Astronomy and Astrophysics, Yale University, P.O. Box 208121, New Haven, CT 06520, USA
(e) Department of Physics, Yale University, New Haven, CT 06511, USA
(f) Planetary Science Institute, 1700 E. Fort Lowell, Suite 106, Tucson, AZ 85719, USA
(g) School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287, USA
(h) Oxford Astrophysics, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
(i) Adler Planetarium, 1300 S. Lake Shore Drive, Chicago, IL 60605, USA
(j) Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218, USA

Abstract

The springtime sublimation process of Mars’ southern seasonal polar CO2 ice cap features dark fan-shaped deposits appearing on the top of the thawing ice sheet. The fan material likely originates from the surface below the ice sheet, brought up via CO2 jets breaking through the seasonal ice cap. Once the dust and dirt are released into the atmosphere, the material may be blown by the surface winds into the dark streaks visible from orbit. The location, size and direction of these fans record a number of parameters important to quantifying seasonal winds and sublimation activity, the most important agent of geological change extant on Mars.
We present results of a systematic mapping of these south polar seasonal fans with the Planet Four online citizen science project. Planet Four enlists the general public to map the shapes, directions, and sizes of the seasonal fans visible in orbital images. Over 80,000 volunteers have contributed to the Planet Four project, reviewing 221 images from the Mars Reconnaissance Orbiter’s HiRISE (High Resolution Imaging Science Experiment) camera, taken in southern spring during Mars Years 29 and 30. We provide an overview of Planet Four and detail the process of combining multiple volunteer assessments to generate a high-fidelity catalog of ∼400,000 south polar seasonal fans. We present the results from analyzing the wind directions at several locations monitored by HiRISE over two Mars years, providing new insights into polar surface winds.

Keywords: Mars, atmosphere; Mars, polar caps; Mars, surface; Mars, polar geology

∗ Corresponding author. Email address: michael.aye@lasp.colorado.edu (K.-Michael Aye)

Preprint submitted to Elsevier Journal, September 6, 2018

1. Introduction

Mars has a predominantly CO2 atmosphere with pressure levels buffered by seasonal CO2 polar caps [Leighton and Murray, 1966]. In the winter, atmospheric CO2 falls as snow or condenses directly onto the surface, forming a seasonal ice layer with a thickness of up to 1 m, depending on the latitude. In the spring, the south polar region of Mars exhibits a host of exotic phenomena associated with sublimation of the seasonal CO2 polar cap, and sublimation winds [Smith et al., 2001] contribute to atmospheric circulation. In the south polar region, images from the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE, McEwen et al. [2007]) document activity best described by the “Kieffer” model [Hansen et al., 2010; Kieffer, 2007; Piqueux et al., 2003a]:

1. Over the winter CO2 anneals to form a translucent slab of impermeable ice.
2. Penetration of sunlight through the CO2 ice, which warms the ground below, results in basal sublimation of the ice. The laboratory measurements by Hansen [2005] show that up to 70 % of the solar energy that reaches the top surface of a 1 m thick slab layer can be transmitted through it. Recent laboratory experiments by Kaufmann and Hagermann [2016] were able to trigger dust eruptions from a layer of dust inside a CO2 ice slab under Martian conditions, lending further credence to the proposed CO2 jet and fan production model.

3. Trapped gas escapes through ruptures in the ice, eroding and entraining material from the surface below [de Villiers et al., 2012].

4. When this dust-laden gas is expelled into the atmosphere, the dust settles in fan-shaped deposits on the top of the ice in directions oriented by the ambient wind, as shown in Figure 1 [Thomas et al., 2010, 2011].

5. When the layer of seasonal ice sublimates in summer, the fans fade, as the material mostly blends back into the surface [Hansen et al., 2010].

6. The compressed CO2 gas streams of the jets are believed to erode the surface, carving uniquely Martian spidery channels originally identified in images from the Mars Orbiter Camera [Piqueux et al., 2003b], now referred to as araneiforms [Hansen et al., 2010].

The number, time history, area covered and changes in direction of the fans provide a wealth of information on the spring sublimation process and spring winds. Apart from a few wind direction estimates from remotely observed dunes [Ewing et al., 2010] and surface rover wind measurements [Greeley et al., 2006; Newman et al., 2017], no widespread wind measurements exist for Mars. The science goals enabled by cataloging fan measurements fall into two categories:

1. Enhance our understanding of spring winds and provide constraints for global and mesoscale circulation models. The length, width, and direction of these fans are snapshots in time of the local wind direction.
Changes in the orientation of the fans over time record changes in wind direction. These markers can be compared to predictions from global and mesoscale circulation models (e.g. Smith et al. [2015]) to improve our understanding of Mars’ weather in the polar regions. Dust injected into the atmosphere can be estimated.

2. Extend our understanding of the sublimation process and its efficacy as an agent of change on the Martian surface. The number of fans as a function of time records sublimation activity while the overlying ice thickness and insolation change during the season. The areal coverage of the fans allows us (with reasonable assumptions about particle size) to estimate the amount of material eroded from the surface on seasonal timescales. Inter-annual variability and the relationship of the timing of seasonal activity to global dust storms can be quantified with this data-set (these are topics of future papers).

Although the value of this data-set is clear, the sheer number of fans (on the order of hundreds of thousands) present in HiRISE images from multiple locations and times observed over many Mars years has made it a daunting data-set to catalog. Attempts at developing automated detection algorithms have been unsuccessful at reliably identifying the locations and shapes of these seasonal fans in images from orbit [Aye et al., 2010]. However, there is increasing interest in using the outcomes of citizen science projects as training data for neural networks (e.g. Alger et al. [2018]; Banerji et al. [2010]; Bird et al. [2018]; Bowley et al. [2018]; Nguyen et al. [2018]; Peng et al. [2018]), hence we believe that these two lines of research will become strongly complementary in the near future. The task of mapping the dark fans is simply pattern recognition, and the human brain is ideally suited for this task, easily capable of spotting and outlining these features.
With the advent of the Internet, tens of thousands of people across the globe can be enlisted to assist scientists with tasks that are impossible to automate. This citizen science or crowd-sourcing approach, where independent assessments from multiple non-expert classifiers are combined, has become an established technique as data volumes have continued to grow. This method has been applied to nearly all areas in astronomy and planetary science [Marshall et al., 2014] (see references therein), including galaxy morphology [Lintott et al., 2008; Willett et al., 2013], identification of planet transits [Fischer et al., 2012; Schwamb et al., 2012], crater counting [Bugiolacchi et al., 2016; Robbins et al., 2014] and a sister project of the efforts presented here, Planet Four: Terrains [Schwamb et al., 2017b]. In collaboration with the Zooniverse¹ [Fortson et al., 2012; Lintott et al., 2011], the largest collection of online citizen science projects, we have developed Planet Four², a web portal to enlist the general public to identify and map the seasonal fans in HiRISE images of Mars’ polar regions.

In this paper we present the first results from the Planet Four project, a catalog of seasonal fans from two Mars years, MY 29 and 30, of HiRISE monitoring of the Martian south polar region. In Section 2, we provide an overview of the HiRISE South Pole Seasonal Processes Monitoring Campaign and the specific HiRISE observations used in this study. In Section 3, we present the Planet Four project and the online classification interface. Section 4 details the process for assessing and combining the volunteer classifications to create a catalog of seasonal features. In Section 5 we examine our catalog’s validity by comparing results between volunteers and science team members. Section 6 presents general statistical results of the catalog, and finally, we use the catalog for an initial probing into regional winds in Section 7. We summarize our conclusions in Section 8. All place names referred to in this paper are informal and not approved by the International Astronomical Union. Full machine-readable versions of the catalogs and tables presented in this paper are also available from https://www.planetfour.org/results.

¹ http://www.zooniverse.org
² http://www.planetfour.org

Figure 1: Subsection of HiRISE image ESP_011960_0925, taken at (LAT, LON) −87.303°, 167.970°; Ls 209.1°. The image is approximately 321.4 m long and 416.6 m wide.

2. HiRISE Instrument and Seasonal Processes Monitoring Campaign

The Mars Reconnaissance Orbiter (MRO) has the ability to point off nadir to target a specific location. In its inclined orbit there are numerous opportunities to achieve repeat coverage in the polar region. In order to study seasonal processes, the HiRISE team selected a limited number of regions of interest (ROIs) in the Martian south polar region to image throughout the spring season. Time of year on Mars is defined by the orbital longitude Ls, where southern spring begins at Ls = 180°. Originally, the HiRISE monitoring campaigns were numbered by the ordinal count of seasons the MRO mission had been observing Mars. This work focuses on the observations from seasons 2 and 3, which have more regular repeat HiRISE imaging of ROIs over multiple years than the season 1 HiRISE monitoring campaign. To be able to compare with other missions and modeling, we also identify our data using the convention of Martian years, established by Clancy et al. [2000] and Piqueux et al. [2015], where Mars Years 29 and 30, also written as MY29 and MY30, correspond to HiRISE seasons 2 and 3. Every day, citizen scientists are making more fan measurements for later Mars years and the catalog continues to grow. The longer timespan covered by the catalog will be discussed in future paper(s).
Figures 4 and 5 provide an overview of the observed locations and times in solar longitudes of the HiRISE data used in this work. Table 1 lists the ROIs selected for analysis using Planet Four. 221 high quality images from southern spring seasons 2 and 3 (i.e. MY 29 and 30) were selected for analysis on Planet Four (see Table 2). The reduced HiRISE products were obtained from the National Aeronautics and Space Administration’s (NASA) Planetary Data System (PDS) HiRISE PDS Data Node³.

HiRISE is a pushbroom imager. It has ten 2048-pixel detectors in the cross-track direction, which together cover ∼6 km at the spacecraft altitude of 300 km (MRO is in an elliptical 255 km by 320 km orbit). An image is built up in the along-track dimension as the spacecraft travels in its orbit, with a ground velocity of ∼3 km s−1. A typical image has ∼60,000 pixels along-track and thus covers a (6 × 18) km² area. Color is available in the center 20 % of the image. A full description of the camera is found in McEwen et al. [2007]. It is generally easier to identify the fans in the color portion of the image, so only the ∼1 km wide color (RGB) sub-image was used for the Planet Four image set.

A visitor to the Planet Four website is presented with a sub-image from a non-map-projected RGB HiRISE image. Each HiRISE frame (typically several hundred megabytes in size) is divided into 840 × 648 pixel sub-images that we will refer to as “tiles”. To avoid edge effects, the tiles are generated such that there is a 100-pixel overlap with the neighboring tiles. We avoid showing volunteers tiles where part or most of the tile is blank. Due to the variable length and width of HiRISE images, there is typically a small region on the right and bottom edges of the non-map-projected HiRISE image that cannot be made into a full-sized tile and thus is not searched for seasonal features with Planet Four.
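The tiling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tile_origins`, the reading of 840 as the tile width and 648 as the tile height, and a stride equal to the tile size minus the 100-pixel overlap are not taken from the Planet Four code base. The skipping of blank tiles is not modeled.

```python
def tile_origins(width, height, tile_w=840, tile_h=648, overlap=100):
    """Return the top-left pixel of every full-sized tile in an image.

    Hypothetical helper: neighbouring tiles share `overlap` pixels, and any
    strip at the right or bottom edge that cannot hold a full tile is dropped.
    """
    step_x = tile_w - overlap  # horizontal stride between tile origins
    step_y = tile_h - overlap  # vertical stride between tile origins
    xs = range(0, width - tile_w + 1, step_x)
    ys = range(0, height - tile_h + 1, step_y)
    return [(x, y) for y in ys for x in xs]


# A made-up image exactly two tiles wide and two tiles tall, with overlap.
origins = tile_origins(width=1580, height=1196)
print(len(origins), origins)
```

With these assumptions, an image one pixel narrower than 1580 pixels would yield only one tile per row, which mirrors the unsearched strip at the image edges mentioned in the text.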
Pixel sampling scales per tile are typically 24.7 cm/pixel when HiRISE is in 1 × 1 binning mode, and the seasons 2 and 3 observations span binning resolutions of 1 × 1 to 4 × 4. For the seasons 2 and 3 monitoring campaign, a HiRISE image is associated with 36 to 635 tiles (see Table 2). For the analysis presented here, 23,723 tiles derived from 129 full frame HiRISE season 2 monitoring images and 19,181 tiles derived from 92 season 3 HiRISE images were reviewed by Planet Four volunteers. A characteristic sample of Planet Four tiles is presented in Figures 2 and 3.

³ http://hirise-pds.lpl.arizona.edu/PDS/

Latitude    Longitude        Informal Name            # of Images   # of Images
(degrees)   (degrees East)                            MY 29         MY 30
-73.53      339.5            Binghamton               2             0
-74.22      168.5            Caterpillar              1             0
-81.38      295.8            Inca City                7             7
-81.46      296.3            Inca City Ridges         7             8
-81.68      66.3             Potsdam                  7             9
-81.80      76.1             Starburst                7             3
-81.93      60.4             Albany                   5             0
-81.9       4.8              Buenos Aires             7             7
-82.2       225.2            Wellington               2             0
-82.3       306              Taichung                 1             0
-82.5       80.0             Buffalo                  2             0
-82.69      273.1            Cortland                 1             0
-83.2       158.4            Rochester                4             0
-84.82      65.7             Giza                     11            7
-85.0       95.0             Schenectady              1             0
-85.02      259.0            Troy                     1             0
-85.13      180.7            Ithaca                   10            6
-85.18      92.0             Geneseo                  0             1
-85.4       103.9            Macclesfield             7             7
-86.25      99.0             Manhattan Cracks         1             5
-86.39      99.0             Manhattan Classic        8             9
-86.8       178.0            Písaq                    3             1
-86.98      169.7            Atka                     3             0
-86.99      99.1             Manhattan Frontinella    5             3
-87.0       72.3             Halifax                  3             0
-87.0       86.4             Oswego edge              6             10
-87.0       127.3            Bilbao                   7             3
-87.3       167.8            Portsmouth               5             6

Table 1: Regions of interest studied with Planet Four that were monitored during both seasons 2 (Mars Year 29) and 3 (Mars Year 30) of the HiRISE Southern Seasonal Processes Campaign. A full list of the images is available as supplemental data in the file P4_catalog_v1.0_metadata.csv. The Latitude and Longitude values are the mean values over the center latitudes and longitudes of the respective HiRISE observations.
All informal names are internal designations used by the Planet Four team and not approved by the International Astronomical Union.

Figure 2: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is reflected in the map_scale column of the Planet Four catalog files.

Figure 3: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is reflected in the map_scale column of the Planet Four catalog files.

Figure 4: Map overview of the regions of interest for the seasonal monitoring campaign of HiRISE. For readability, the following regions are shown as cyan-colored unlabeled dots: Inca City Ridges, Schenectady, Troy, Manhattan Cracks, Manhattan Classic, Atka, Halifax, Oswego edge.

Figure 5: Temporal and latitude coverage for the season 2 and season 3 HiRISE monitoring campaign observations reviewed on Planet Four.
Observation ID     Latitude   Longitude     Ls      Start Time   North     # of
                   [deg]      [deg east]    [deg]                Azimuth   Tiles
ESP_011296_0975    -82.197    225.253       178.8   2008-12-23   110.6     91
ESP_011341_0980    -81.797    76.13         180.8   2008-12-27   110.2     126
ESP_011348_0950    -85.043    259.094       181.1   2008-12-27   123.6     91
ESP_011350_0945    -85.216    181.415       181.2   2008-12-27   99.7      126
ESP_011351_0945    -85.216    181.548       181.2   2008-12-27   128.0     91
ESP_011370_0980    -81.925    4.813         182.1   2008-12-29   110.6     126
ESP_011394_0935    -86.392    99.068        183.1   2008-12-31   139.4     72
ESP_011403_0945    -85.239    181.038       183.5   2009-01-01   106.5     164
ESP_011404_0945    -85.236    181.105       183.6   2009-01-01   134.1     91
ESP_011406_0945    -85.409    103.924       183.7   2009-01-01   111.3     126
ESP_011407_0945    -85.407    103.983       183.7   2009-01-01   138.8     91
ESP_011408_0930    -87.019    86.559        183.8   2009-01-01   148.9     59
ESP_011413_0970    -82.699    273.129       184.0   2009-01-01   112.8     108
ESP_011420_0930    -87.009    127.317       184.3   2009-01-02   157.3     54
ESP_011422_0930    -87.041    72.356        184.4   2009-01-02   157.0     54
ESP_011431_0930    -86.842    178.244       184.8   2009-01-03   148.6     54
ESP_011447_0950    -84.805    65.713        185.5   2009-01-04   113.0     218
ESP_011448_0950    -84.806    65.772        185.6   2009-01-04   138.8     59

Table 2: Partial table of the HiRISE observations used, to indicate spatial and temporal coverage. The full table is published in the online version. The coordinates given are the center coordinates of the HiRISE pointings used in this study. Latitudes are planetocentric, and the given north azimuth angle is for the non-map-projected data that went into the Planet Four system.

3. Planet Four

Here we describe the Planet Four classification interface and the information generated by volunteers visiting the Planet Four website.

3.1. Classification Web Interface

Planet Four volunteers are asked to identify and outline fans in the presented tiles. Sometimes a deposit has an indeterminate direction, in which case we call it a “blotch”. Although less useful for wind regime studies, the blotches are sites where the ice has ruptured and released material, so they are important for studying the sublimation process of the polar CO2 ice sheet. Thus, volunteers are asked to identify and mark blotches as well. Positions, orientations, and sizes of fans and blotches are obtained via a web interface (see Figure 6) built upon the Zooniverse’s Application Programming Interface (API), which communicates with their custom-built Ouroboros web platform (described in Appendix A). Each tile is assessed by approximately 30–100 independent reviewers. To ensure reviewers have no prior information that may influence their judgment, tiles are randomly served to the classifier, and no identifying information about the parent HiRISE image is presented in the Planet Four web interface. The volunteer is blind to the location on the South Pole, the time of season at which the observation was taken, and responses from other classifiers while reviewing a given tile. Planet Four was launched originally in English; later, the website, classification interface, and help material were also translated into several languages, including traditional and simplified character Chinese, German, and Magyar (Hungarian). For the analyses presented here, all Planet Four classifications are treated the same, regardless of what language the volunteer was using in the classification web interface.

3.1.1. Tutorial

First-time visitors to the Planet Four website are presented with a short inline interactive tutorial that explains the task and guides the classifier on how to use the marking tools. Additional training material is also available elsewhere on the site. The tutorial is shown only once to classifiers using the Planet Four web interface while logged in with a registered Zooniverse account. Volunteers using the site in the non-logged-in mode are presented with the tutorial each time they visit the Planet Four website.
Other than the frequency of the tutorial appearing, the user experience on Planet Four, including the tutorial content, is exactly the same for logged-in and non-logged-in volunteers.

3.1.2. Marking Tools

Fans and blotches are drawn by selecting the appropriate tool in the classification interface (see Figure 6), clicking on the tile displayed, and dragging to resize the marker to the appropriate shape and orientation. The fan tool generates a triangle with a rounded base, with the user controlling the endpoint of the fan. The default opening angle for the fan marker is set to 5°. The blotch tool simply produces an ellipse, with the user controlling the size and orientation of the major axis. For blotches, the default length of the minor axis is 0.75 times the pixel length of the major axis drawn. Once a blotch or fan marking has been made, a classifier can edit the initial parameters by manipulating handles on the marker. For blotches, the length of the major and minor axes and the rotation can be adjusted. For fans, the opening angle, orientation, and length can be modified. If only a single mouse click is made on the interface, then the minimum-sized fan or blotch marker is produced: a fan with a length of 10 pixels and an opening angle of 1°, or an ellipse with both axes equal to 10 pixels. Additionally, there is an ‘Interesting Feature’ tool available for volunteers to highlight the position of anything that they deem worthy of review by the Planet Four Science Team. The Interesting Feature marker is not resizable. All markers drawn in the web interface can be repositioned or removed by the classifier.

Figure 6: The fan (above) and blotch (below) marker on the Planet Four tutorial image. Black circles and diamonds are the marker handles that can be used to adjust the shape and orientation in the web classification interface. The “x” is used to delete the marker.

3.2.
Classification Database

Once the volunteer is done making markings, if any, and hits the ‘Finished’ button, the classification (which we define as the sum total of all the markings, or lack of markings, made by the volunteer) is submitted to the Ouroboros API to be saved to a database. At this point, the classifier can move on to view the next tile by hitting the ‘Next’ button or can choose instead to enter the Planet Four discussion tool (discussed in further detail in Section 3.3). Once the classification has been submitted, it cannot be revised. For blotches, the center position, rotation angle, and pixel lengths of the major and minor axes of the ellipse are recorded. For fans, the starting position, distance in pixels from the starting point to the end of the fan, opening angle, and rotation angle are saved to the database. For interesting features, only the pixel location is stored. If no features are marked, the database records the classification as a non-marking. A tile identifier and timestamp for each classification are also stored in the database.

If the volunteer is logged in with a registered Zooniverse account, the classifications are tracked in the database via the associated username. For non-logged-in classifications, a unique session id is generated and used to link the classifications completed by a given IP address and web browser. The non-logged-in identifier does not exactly correspond one-to-one to a unique individual. If a person classifies non-logged-in and changes their IP address, their new classifications would be stored under a different identifier. Additionally, if a volunteer initially participates as a non-logged-in classifier on Planet Four and then registers for a Zooniverse account, the previous classifications stored in the database are not linked to the Zooniverse username and remain associated with the unique non-logged-in session identifier.
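As a reading aid, the stored quantities listed above can be pictured as a small set of records. The following Python sketch is purely illustrative: all class and field names (`Fan`, `Blotch`, `InterestingFeature`, `Classification`, `classifier_id`, and so on) are hypothetical and do not reflect the actual Ouroboros database schema or the column names of the published catalog files.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Union


@dataclass
class Fan:
    x: float         # starting (base) position in tile pixels
    y: float
    distance: float  # pixels from the starting point to the end of the fan
    spread: float    # opening angle in degrees
    angle: float     # rotation angle in degrees


@dataclass
class Blotch:
    x: float         # center position in tile pixels
    y: float
    major: float     # pixel length of the major axis
    minor: float     # pixel length of the minor axis
    angle: float     # rotation angle in degrees


@dataclass
class InterestingFeature:
    x: float         # only the pixel location is stored
    y: float


Marking = Union[Fan, Blotch, InterestingFeature]


@dataclass
class Classification:
    tile_id: str        # identifier of the tile that was reviewed
    classifier_id: str  # Zooniverse username or non-logged-in session id
    created_at: datetime
    markings: List[Marking] = field(default_factory=list)

    @property
    def is_non_marking(self) -> bool:
        # A classification without any markings is kept as a non-marking.
        return not self.markings


# Made-up example values, for illustration only.
now = datetime(2013, 1, 8, tzinfo=timezone.utc)
empty = Classification("tile-001", "session-abc", now)
marked = Classification("tile-001", "some_user", now,
                        [Fan(x=100, y=200, distance=50, spread=5, angle=30)])
print(empty.is_non_marking, marked.is_non_marking)
```

Keeping non-markings as explicit records matters for the later analysis, because the number of volunteers who saw a tile without marking anything is needed to judge how strongly the crowd agrees on a feature.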
We note there are occasional spurious or duplicate entries stored in the classification database, typically due to a glitch in the classifier’s browser or a minor bug in the Ouroboros framework. These entries compose a very small percentage of the total volunteer classifications. They are easily identified and removed from the analysis presented here. Further details are provided in Appendix B. Additionally, the Planet Four classification interface originally recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The true spread angle of the fan marker drawn by the volunteers is recoverable from the values recorded in the database, and we have adjusted the classifications affected.

3.3. Talk Discussion Tool

Associated with the Planet Four classification interface is a dedicated object-oriented discussion tool known as “Talk”⁴. Each Planet Four tile assessed on the main classification interface has a dedicated page on the Planet Four Talk website. Volunteers can access these pages directly through the classification interface after submitting their classification. With Talk, volunteers can write comments, add searchable Twitter-like hash tags, create longer side discussions, and group similar tiles together in collections. For the analysis presented here, we focus strictly on the volunteer markings from the main user interface, and do not include a complete analysis of the data from the Talk tool.

⁴ http://talk.planetfour.org

Figure 7: Distribution of the number of Planet Four classifications for Season 2 (MY29) and Season 3 (MY30) tiles with a bin size of 5. The distribution peaks at the two different retirement values of 100 and 30. Due to performance issues in the webserver’s queueing system, the retirement values were at times not enforced, leading to the spread-out distributions at values higher than the retirement values.

3.4.
Site History

Planet Four was publicly launched on 2013 Jan 8 as part of the British Broadcasting Corporation’s (BBC) Stargazing Live, three nights of live astronomy programming (2013 Jan 8–10) on BBC Two in the United Kingdom. Review of Season 2 and 3 tiles spanned January 2013 to March 2015, with 9,809,637 classifications produced in total. The majority of classifications for Seasons 2 and 3 were obtained during the BBC Stargazing period, but subsequently data from HiRISE’s other seasonal monitoring campaigns were mixed with the Season 2 and Season 3 classifications. The results from data outside seasons 2 and 3, which are still in the process of being reviewed on the Planet Four website, will be the topic of subsequent publications.

Figure 7 plots the distribution of classifications per tile for Seasons 2 and 3. Due to the high classification rate at launch, tiles were set to retire from rotation in the web interface after 100 independent assessments (counting duplicates) to ensure that the project would continue to serve data over the Stargazing period. Over time the classification rate dropped significantly from launch, and on 2013 Dec 9 the retirement threshold for a tile was lowered to a more reasonable, and statistically acceptable, value of 30 to better accommodate the actual work rate on Planet Four. This value is similar to the image retirement threshold that was used by the Zooniverse’s Milky Way Project [Simpson et al., 2012], which enlists the general public in a similar task, drawing circles on space-based infrared images to identify the shape and size of star formation bubbles.

3.5. User Statistics

36,433 registered volunteers and 48,094 non-logged-in sessions have classified at least one tile in our MY29/30 data-set. Volunteers made in total 9,461,062 classifications with a median of 7 and average of 41 classifications per registered volunteer/non-logged-in session. The highest number of different classifications (i.e.
submitted Planet Four tiles) by the same volunteer was 31,808. After clean-up, Planet Four volunteers drew a combined 3,460,056 blotches, 2,694,415 fans, and 805,903 interesting features. Figure 8 shows the distribution of volunteer classifications for Seasons 2 and 3 tiles combined. Individual registered volunteers (median of 14 and average of 69 classifications per user) tend to contribute slightly more classifications than an individual non-logged-in session (median of 4 and average of 21 classifications per session). A given volunteer/session reviews only a small percentage of the entire sample of HiRISE tiles. Only 15 % of classifiers (12,483 registered volunteers and non-logged-in sessions) have contributed more than 50 classifications. Most volunteers contribute a few classifications of Planet Four tiles before leaving the site. This is a typical response for web-based projects [Crowston and Fagnot, 2008; Zachte, 2012] and is similar to the volunteer behavior found on other Zooniverse projects [Sauermann and Franzoni, 2015].

Figure 8: Distribution of volunteer classifications. Panel (a) shows the combined distribution tallied together for both logged-in and non-logged-in sessions. Panel (b) shows the volunteer classification count individually for registered and non-logged-in volunteers. Both histograms use a bin size of 2.

4. Data reduction

In order to create fan and blotch object catalogs from the Planet Four markings, a reduction pipeline was implemented, for which the code is open source and made available⁵. The pipeline is based on the Python programming language, interfacing also to the US Geological Survey’s (USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker et al., 2007], and making use of the “scikit-learn” package for machine-learning related tasks [Pedregosa et al., 2011]. This data reduction pipeline has five main conceptual stages (see Fig.
9): Cleanup, where the Planet Four classification data is cleaned, normalized and converted to a binary database (Section 4.1); Clustering, where the markings of the many different volunteers are combined into, ideally, one resulting average object (Section 4.2); Combination, where we combine fan and blotch markings that seem to address the same visible object in the image into a meta-object for further processing during the next stage (Section 4.3); Thresholding, where a cut on the required number of volunteers that voted for either fan or blotch decides if the previously created meta-object should be considered a fan or a blotch (Section 4.3.1); and finally Ground Projection, where we project the HiRISE image pixel coordinates of the resulting fan and blotch markings into latitude and longitude coordinates on Mars (Section 4.4).

⁵ The pipeline is located at https://github.com/michaelaye/planet4.

Figure 9: Overview of conceptual steps of the Planet Four data reduction pipeline. [Flowchart boxes: Database Cleanup; Clustering per fans/blotches; Fan & Blotch overlapping? (yes/no); Create meta-object with marking weights; Thresholding decides between fan and blotch; Ground Projection; Final fan/blotch catalog.]

4.1. Database Cleanup

After the removal of the tutorial data (see Section 3.1.1) and a first cleaning for spurious, incomplete and duplicate classification database entries (see Appendix B), we normalize all angles from the Planet Four classification interface, and finally produce a binary database in the HDF5 format (Hierarchical Data Format, version 5) for the remainder of the data processing. Normalizing the angles is required because the Planet Four system records blotches with an angular range from -180 to 180 degrees, while ellipses possess a two-fold rotational symmetry. This means only the range of 0 to 180 degrees is required to fully describe blotches, once the radii are sorted in a consistent way (semi-major axis first).
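The angle normalization can be illustrated with a short Python sketch. The text only states the outcome (angles in the range 0 to 180 degrees, semi-major axis first); the specific rule used here, adding 90 degrees whenever the two radii are swapped, is an assumption based on the geometry of an ellipse and is not taken from the pipeline code. The function name `normalize_blotch` is likewise hypothetical.

```python
def normalize_blotch(radius_1, radius_2, angle_deg):
    """Return (semi_major, semi_minor, angle) with the angle in [0, 180).

    Assumed convention: `angle_deg` is the orientation of the axis that
    belongs to `radius_1`. If the radii have to be swapped so that the larger
    one comes first, the orientation is rotated by 90 degrees to refer to the
    new first axis. The two-fold symmetry of an ellipse then allows folding
    the angle into [0, 180).
    """
    if radius_1 < radius_2:
        radius_1, radius_2 = radius_2, radius_1
        angle_deg += 90.0
    # Python's % returns a result with the sign of the divisor,
    # so negative angles are mapped into [0, 180) as well.
    return radius_1, radius_2, angle_deg % 180.0


print(normalize_blotch(10.0, 5.0, -170.0))  # same ellipse as an angle of 10 degrees
print(normalize_blotch(5.0, 10.0, 30.0))    # radii swapped, angle rotated by 90 degrees
```

After such a step, two volunteers who outline the same blotch but begin from different axes end up with comparable parameter values.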
Volunteers start drawing the ellipses that mark blotches from either the semi-minor or the semi-major axis, seemingly at random, which makes clustering on these parameters error-prone without normalization. The cleaned raw Planet Four classifications used by this work's analyses are provided as supplemental data in the file P4_catalog_v1.0_raw_classifications.csv. Further details about the format of the raw classifications are described in Appendix C.

4.2. Clustering

We identify fans and blotches by combining the multiple volunteer assessments of each Planet Four tile. To identify and precisely locate the marked features from the classifications performed by many (between 30 and 100, see Appendix A) volunteers per Planet Four tile, we perform a clustering analysis on the data. Figure 10 shows an example of fan markings for a Planet Four tile. After evaluating several different clustering algorithms, we identified the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm of Ester et al. [1996] as the most appropriate one for our application. DBSCAN has the advantage of not requiring the number of expected clusters as input. Instead, it is controlled by two input parameters describing the minimum number of members of a cluster (min_samples) and the maximum distance for a data point to be included in a cluster (epsilon). Details on how we determine these parameters are described in Section 4.2.1. We set up our clustering pipeline using the DBSCAN implementation in the scikit-learn Python library [Pedregosa et al., 2011]. All volunteer responses are treated with equal weight in the clustering algorithm.
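As a minimal sketch of how DBSCAN behaves on marking positions, the following example clusters a handful of made-up (x, y) base points with scikit-learn. The helper name `cluster_points`, the toy coordinates and the parameter values are illustrative assumptions and are not taken from the pipeline; the values actually used are discussed in Section 4.2.1.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_points(xy, eps, min_samples):
    """Cluster 2D points; return per-point labels and the mean position per cluster."""
    xy = np.asarray(xy, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(xy).labels_
    # DBSCAN labels noise points with -1; only real clusters are averaged.
    cluster_ids = np.unique(labels[labels >= 0])
    centers = np.array([xy[labels == k].mean(axis=0) for k in cluster_ids])
    return labels, centers


# Toy data: two groups of four markings each and one stray marking.
base_points = [
    (100, 100), (102, 101), (99, 98), (101, 103),
    (400, 300), (402, 299), (398, 301), (401, 302),
    (250, 600),
]
labels, centers = cluster_points(base_points, eps=10, min_samples=4)
print(labels)   # the stray marking receives the noise label -1
print(centers)  # one averaged position per cluster
```

Note that the number of clusters is never passed in; it follows from the density of the markings, which is the property that motivated the choice of DBSCAN above.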
Due to the differences in the classification interface for marking fans and ellipse-shaped blotches (fans are drawn from a base point, whereas blotches are drawn from the center), the fan and blotch markings are clustered separately at this stage and require their own sets of clustering parameters. In a first stage, we cluster the data of each Planet Four tile on the (x, y) pixel coordinates of the base point for fans and of the center for blotches (see Fig. 12 for a visual description of the available coordinates of the markings). Figure 13 shows the result of clustering in the two dimensions of the x and y base coordinates of the fan markings, using the multi-step approach shown in Fig. 11 and described below. Once the clusters for a given set of parameters have been defined (see Section 4.2.1 for details on the parameter tuning), the original marking data of the members of each cluster are averaged to create one average marking object per cluster, including average directions for fan objects (e.g. in Fig. 13). The number of markings that went into the creation of the averaged object is stored for later use.

Figure 10: Fan markings for Planet Four tile APF00001cl of HiRISE image ESP_012322_0985. Left: The cut-out tile that is shown to the Planet Four volunteers. Right: 51 different users have classified this image. The colors cycle through randomly for the markings of different users. With such a large number of different volunteers classifying, the "sensitivity" for detection is increased, as shown by a few markings that outline even the smallest potential dark deposit candidates. However, when the "crowd" does not agree with these, i.e. if the potential cluster does not reach the min_samples number of required members, the clustering pipeline discards these entries, as shown in Fig. 13.

[Figure 11 diagram. Fan clustering: base coords within 10 px, then angle within 20°. Blotch clustering: center coords within 10 px and rad_1 & rad_2 within 30 px; then center coords within 25 px and rad_1 & rad_2 within 50 px.]

Figure 11: The sequence of clustering steps for both fan and blotch markings. It became apparent during our studies that fan markings show less scatter, probably because the tool has to be placed at a clearly identifiable base point. Blotches, however, do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. This required a second run of clustering with more relaxed cluster parameters, as described in Section 4.2.1 and in Table 3.

[Figure 12 diagram labels. Pixel position (x, y) with origin (0,0). Fan: Base Point (pixels), Distance (pixels), Angle from horizontal (degrees), Spread (degrees). Blotch: Center Point (pixels), Radius_1 (pixels), Radius_2 (pixels), Angle from horizontal (degrees).]

Figure 12: The different coordinates available in the Planet Four marking catalog. Fans possess (x, y) base coordinates, an angle from horizontal for their pointing and a spread angle. Blotches possess center (x, y) coordinates, semi-major and semi-minor axis radii, and an angle indicating their alignment towards the horizontal.

Figure 13: Fans from Figure 10 for Planet Four subject ID APF00001cl after applying our clustering pipeline. Left: For direct comparison, the same markings as in the right panel of Fig. 10. Right: Results after clustering, identification of noise markings, and averaging the cluster members' data into one object per cluster. Markings that do not become a member of a cluster are defined as noise and are discarded from further processing (shown as white dots).

Figure 14: Planet Four tile APF0000de3 from HiRISE image ESP_011961_0935. It shows the prevalence and precise identification of CO2 jet deposits with multiple directions that start from the same base point, indicating multiple eruptions under different wind directions. The large fan is the second longest recorded in the catalog, with a length of approx. 368 m.

After having clustered both fans and blotches on their base and center coordinates respectively, we apply a second stage of clustering on the markings. For fan deposits, the major objective of this work is to determine the wind direction they indicate. We therefore want to be able to distinguish between different wind directions from the same source point, i.e. multiple subsequent eruptions, where later eruptions occurred under a different prevalent wind direction. In the Planet Four help content we have emphasized that the volunteers should outline several fans if they appear to start from the same source point. This is very relevant for data like that in Fig. 14, where several wind directions are indicated by the fans from multiple subsequent jet eruptions. By clustering not only on the base coordinates (x, y) but also on the recorded alignment angle of the fan markings, we are able to distinguish these subsequent fan deposits with different wind directions. By reviewing the clustering results of a subset of the data, we determined that 20 degrees as a clustering value for angles enables this objective. It means that fan markings whose alignment angles are further away from each other than 20 degrees are clustered into their own subcluster, even if they start at the same base point.

Blotches, on the other hand, are used for deposits that do not clearly indicate a direction, which is why we do not apply an angle clustering here. However, blotches do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches.
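The two-step fan clustering described above (base coordinates first, then alignment angle) could be sketched as below. The function name `cluster_fans`, the dictionary keys, the toy data and the use of a plain arithmetic mean for the direction are assumptions made for illustration; in particular, the sketch ignores the wrap-around of angles at 0/360 degrees and does not reproduce the pipeline's actual implementation.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_fans(base_xy, angles_deg, eps_xy=10, eps_angle=20, min_samples=3):
    """Cluster fan markings on their base point first, then on their direction."""
    base_xy = np.asarray(base_xy, dtype=float)
    angles = np.asarray(angles_deg, dtype=float)
    fans = []

    # Step 1: group markings whose base points lie close together (pixels).
    xy_labels = DBSCAN(eps=eps_xy, min_samples=min_samples).fit(base_xy).labels_
    for xy_id in np.unique(xy_labels[xy_labels >= 0]):
        members = np.flatnonzero(xy_labels == xy_id)

        # Step 2: within one base-point cluster, split by direction (degrees).
        # Assumption: no wrap-around handling at 0/360 degrees.
        ang_labels = DBSCAN(eps=eps_angle, min_samples=min_samples).fit(
            angles[members].reshape(-1, 1)
        ).labels_
        for ang_id in np.unique(ang_labels[ang_labels >= 0]):
            sub = members[ang_labels == ang_id]
            x, y = base_xy[sub].mean(axis=0)
            fans.append(
                {"x": x, "y": y, "angle": angles[sub].mean(), "n_markings": len(sub)}
            )
    return fans


# Toy data: six markings share one base point but show two directions;
# the seventh marking is isolated and ends up as noise.
base = [(200, 200), (202, 199), (198, 201), (201, 202), (199, 198), (200, 203), (50, 400)]
angles = [28, 30, 33, 98, 100, 103, 10]
fans = cluster_fans(base, angles)
for fan in fans:
    print(fan)
```

In this toy case the single base-point cluster is split into two averaged fan objects of three markings each, which mirrors the situation in Fig. 14 where several eruptions from one source point record different wind directions.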
For blotches, we therefore also cluster on the resulting ellipse radii to ensure that we identify the statistically most common shape of the volunteers' blotch markings.

The values of the clustering parameters strongly influence the number of identified features. We therefore studied extensively how they affect our results by reviewing random subsets of the data set, which led to the empirical determination of the clustering parameter values that we eventually used for the catalog production. These procedures are discussed in the following sections (see Fig. 15 for an example of reviewing parameter values). The results of the clustering stage are shown in the lower right (blotches) and lower middle (fans) parts of Figures 16 and 17.

Marking     Fans        Fans          Blotches      Blotches
Dimension   xy (base)   angle (deg)   xy (center)   radius (px)
Small       10 px       20            10 px         30 px
Large       NA          NA            25 px         50 px

Table 3: Empirically determined epsilon values for the clustering pipeline. NA: Fan markings did not require a second clustering run with relaxed precision on the distance. Apparently, the fact that a fan has to be drawn from a distinguishable starting point helped the volunteers to keep the scatter small, both in base coordinates and in angle precision.

4.2.1. Cluster parameters

min_samples. As described in Section 3.2, Planet Four tiles have varying numbers of user classifications. The classifications for each Planet Four tile are therefore clustered separately, with a variable requirement on the min_samples clustering parameter. More classifications for a Planet Four tile mean that we have a higher "sensitivity" to smaller features (see for example Fig. 10, right), so to achieve a uniform detection efficiency, we implement a scaling factor on the required number of samples per cluster. This results both in a higher sensitivity for having seasonal fans and blotches marked and in higher-precision averaged objects at the end of the clustering process. In other words, the signal-to-noise ratio (SNR) is higher for a Planet Four tile that was classified by a larger number of volunteers, and we adapted the clustering process to normalize for that fact. To address the variable SNR in our data, we empirically determined a scaling factor min_samples_factor (MSF) that, multiplied with the number of classifications that contain blotch or fan markings, results in the min_samples value for the DBSCAN algorithm:

\[ \mathrm{min\_samples} = \mathrm{round}\left(\mathrm{min\_samples\_factor} \cdot n_\mathrm{markings}\right), \]

with n_markings ≤ n_classifications, the number of classifications that contain either blotch or fan markings. The best value for MSF was empirically found to be 0.13. For example, when a Planet Four tile has n_classifications = 30 classifications (our current retirement value), min_samples will be 4 (assuming that all of them contain markings). This value is the number of cluster members that is required for a cluster to be created. When a tile has 70 submissions, however, 9 cluster members are required for a cluster to be deemed a real detection and to be entered into the next stage of the pipeline. This way, we exploit the higher sensitivity from the larger number of submitted classifications.

epsilon. The second DBSCAN parameter, epsilon, describes the largest distance that two points are allowed to have for them to be considered part of the same cluster. The dimension of this measurement depends on which mathematical feature is currently being clustered. When we cluster on the base point coordinates of fans, or on the central point coordinates or semi-radii of blotches, the feature space is measured in pixels, while fan angles are clustered in degrees. The size scale of the dark fans and blotches varies significantly between different regions of interest at the south pole of Mars.
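As a numerical check of the min_samples scaling defined earlier in this section, the following sketch evaluates the formula for the two tile sizes quoted in the text. The function name `min_samples_for_tile` is hypothetical, and Python's built-in `round` is assumed to stand in for the rounding used by the pipeline.

```python
def min_samples_for_tile(n_markings, min_samples_factor=0.13):
    """min_samples = round(min_samples_factor * n_markings).

    n_markings is the number of classifications of a tile that contain fan
    or blotch markings. Note that Python's round() resolves exact .5 ties
    to the nearest even integer; whether the pipeline does the same is an
    assumption of this sketch.
    """
    return round(min_samples_factor * n_markings)


for n in (30, 70):
    print(n, min_samples_for_tile(n))
```

With the factor of 0.13, a tile with 30 marking-bearing classifications requires 4 cluster members and a tile with 70 requires 9, in agreement with the values stated above.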
When trying to cluster our data with only one value of epsilon, we realized that it was not possible to properly resolve small markings on the order of 20 pixels, which were precisely positioned by the volunteers, while at the same time successfully clustering markings of much larger deposits that can stretch across more than half of the Planet Four tile shown to the volunteers. The spread in marking coordinates is smaller for smaller features (we think because of an increased attention to detail for smaller features), and thus, to ensure the identification of large features, we implemented a second stage of clustering with larger allowed values for epsilon. The resulting values in Table 3 were selected empirically after review of a random subset of the pipeline output. Fig. 15 shows an example of the parameter scan review graphics that the science team used to determine the parameter values that work best for our task.

Figure 15: This figure shows our review plots for determining the best clustering parameters for Planet Four tile ID 1cl. In this example, we review the fan clustering for two different min_samples values, set by using a min_samples_factor of 0.1 and 0.13 respectively, leading to min_samples values of 5 and 7. Additionally, we scan the epsilon (EPS) value for small deposits with the settings 10, 20, and 30 pixels, while the EPS_LARGE value stays at 25 pixels for these runs (having no effect in this case due to the small size of the markings). The upper left 3 plots are for the setting of MSF=0.1 (resulting in a min_samples value of 5) and EPS at the 10, 20, and 30 pixel values. The second group, with an MSF of 0.13 (resulting in min_samples=7), starts in the upper right with the fourth plot in the upper row and continues in the lower left with the first two plots, again showing the tests for EPS values of 10, 20, and 30 pixels respectively.
The last two plots in the lower row show what the volunteers actually marked and what they received as input for the markings: the Planet Four tile, cut out from the larger HiRISE image. The number of fans clustered varies significantly for different clustering parameter values, with n between 11 and 16. We favor the setting in the upper right plot, because it correctly identifies all small center fans while not creating an object for the small black spot at the top of the image tile.

Figure 16: This figure shows the final pipeline result of the tile from Fig. 15. Upper Left: The input tile; Upper Middle: Fan markings of the volunteers; Upper Right: Blotch markings of the volunteers; Lower Right: Blotch markings after clustering and averaging the cluster members; Lower Middle: Fan markings after clustering and averaging the cluster members; Lower Left: The final catalog entries. To reach this, the results from Lower Middle and Lower Right are compared, and the higher voted markings at comparable locations win. How high that winning ratio must be for a marking to enter the final catalog is determined by the threshold value (see Section 4.3.1). Note how the center fans are cleanly identified and win the voting competition against the blotch at the same location. The opposite is true for the small object identified at the middle left, where a red blotch marking has won against the small cyan fan.

Figure 17: Three example Planet Four tile pipelines, for APF0000b0t, APF0000ops, and APF0000bk7. See Fig. 16 for a detailed description of the pipeline plotting sequence.

4.3. Combination

When the direction of a fan deposit is not very pronounced, i.e. the prevalent winds were weak at the time of the jet eruption, there is ambiguity in identifying the deposit as a fan or a blotch. This can result in a given ground source having surviving clusters of both fan and blotch markings, which need to be combined in a strategic way to create a final object category for the observed ground source that will be listed in the resulting object catalog. We make use of the relative frequency with which each marking tool was used to create the two marking clusters to identify how fan-like a source is. For example, if 5 people classified a marking as a fan, but 5 other people marked it as a blotch, we assign a fan probability P(fan) of 0.501 by applying

\[ P(\mathrm{fan}) = \frac{n_\mathrm{fans} + 0.01}{n_\mathrm{fans} + n_\mathrm{blotches}}, \]

with n_fans and n_blotches the number of volunteers that marked either. The fudge value 0.01 is required to be able to make an either-or decision for the object when n_fans = n_blotches, deciding this close call in favor of fans instead of blotches, due to the usefulness of fans for further scientific analysis.

We determine to which markings this procedure is applied by calculating the pair-wise Euclidean distance for all clustered objects and checking whether clusters are within a chosen limit of 30 pixels of each other. We chose this value to allow slightly more imprecision in the positioning of the markings than the clustering algorithm that created these averaged objects, without combining too many markings that really should be individual items. We have reviewed several hundred subsets of data and determined 30 pixels to be a good compromise between these competing tasks. If a distance pair meets the combination criterion, we use the above formula to calculate P(fan) for this pair of markings. This value ranges from approximately 0, a definite blotch when n_fans = 0, to approximately 1, a definite fan when n_blotches = 0; in other words, either none or all of the volunteers had drawn a fan. We then create a meta-object for this pair, storing P(fan) under the name 'vote_ratio' in the catalog files, together with all other data for both objects. We do this to enable future users of the catalog to decide on their own how reliably a marking is required to be a fan before it is used as such, with its data entering a study. In other words, a specific study might require using only the clearest fan markings, maybe with a P(fan) larger than 0.8. Applying such a cut is called Thresholding in our pipeline and is described in the next section.

4.3.1. Thresholding

For concrete applications, e.g. for this publication, a scientist can now apply a cut on P(fan) that writes out the decision to a new catalog file with fans and blotches. For example, a cut on P(fan) of 0.8 would mean that all meta-objects with a value smaller than 0.8 are written out as the underlying blotch, while for meta-objects with a value larger than 0.8 the stored fan is written out. In both cases, the remaining data of the meta-object that was thresholded against is dropped from the newly created catalog file, but it is still available for other thresholding operations as an intermediate data product. An example use case would be a scientist who wants to study the sensitivity of their research to the applied cut. For example, if we want to provide wind direction data to a mesoscale climate simulation, we might want to make sure that only the most certain directions are being used and would apply a higher cut on the meta-object value.

For the catalog that we deliver with this work, we chose a simple majority threshold of 0.5, so that the catalog offers the broadest use case. Choosing a simple majority means that we take a marking as a fan as soon as at least as many volunteers have classified an object as a fan as have classified it as a blotch. Catalog files with this applied P(fan) threshold of 0.5, all intermediate data products, and instructions on how to apply a threshold for writing out new catalog files will be provided as supplementary products (see Appendix D for more details).

4.4. Ground Projection

For each Planet Four tile, the clustering of volunteer-drawn markings to identify seasonal sources is performed using the pixel positions within the Planet Four tiles. Once the cluster dimensions and position have been identified, the source's true location at the south pole must be calculated. However, the non-map-projected color mosaics generated by the HiRISE team, from which the Planet Four tiles are derived, do not contain the spacecraft information necessary to compute the latitude and longitude per pixel. We partially reconstruct the mosaics from the raw HiRISE image products, or Experiment Data Records (EDRs), building a red-filter-only composite image with the spacecraft information required to perform coordinate transforms. The HiRISE EDRs were obtained from the HiRISE Data Node of NASA's Planetary Data System (PDS). We developed a reduction pipeline in Python using the US Geological Survey's (USGS) Integrated Software for Imagers and Spectrometers (ISIS; footnote 6) [Anderson et al., 2004; Becker et al., 2007] and the ISIS-3 Python wrapper Pysis (footnote 7) for this purpose.

We briefly summarize the steps to generate the red-filter-only mosaic, as shown in Fig. 18, including the required ISIS-3 application names. We start with the center two RED filter CCDs (RED 4 and 5), each with two readout channels. All four EDR files (2 for each CCD) are read in and converted to ISIS-3 cube format, and the SPICE (Spacecraft & Planetary ephemerides, Instrument C-matrix and Event kernels) information for MRO is added to the EDR headers. For each CCD, we combine the two channel EDRs into a single image. The combined image is then normalized to remove both the striping and the left/right normalization effects. This step is not necessary for obtaining map projection information, but it makes it easier to visually inspect the final combined mosaic. Once both CCDs have been reduced, they are combined into a final mosaic, accounting for the 48 pixel (in 1×1 binning) overlap.
Once the single-filter red mosaic is made, we are able to translate any fan and blotch pixel position to latitude and longitude at the south pole using ISIS-3's campt application. The catalog tables P4_catalog_v1.0_L1C_cut_0.5_fan_meta_merged.csv and P4_catalog_v1.0_L1C_cut_0.5_blotch_meta_merged.csv, provided as supplemental files, include the cluster coordinates as latitude/longitude derived from this process, as well as a set of positional coordinates (X, Y, Z) in the body-fixed reference frame for Mars, measured in kilometers.

Footnote 6: http://isis.astrogeology.usgs.gov/
Footnote 7: https://github.com/wtolson/Pysis

[Figure 18 flowchart: 4 images (2 center CCDs, 2 channels per CCD); hi2isis; add SPICE data (spiceinit); stitch channels, creating 1 image per CCD (histitch); remove striping and normalization problems (cubenorm); create mosaic by merging the 2 remaining center CCD images (handmos); translate fan and blotch pixel positions to lat/lon (campt); ground coordinates.]

Figure 18: Process for creating single-channel non-map-projected mosaics with the required SPICE header information used to convert Planet Four feature pixel coordinates to geographical lat/lon coordinates. The required ISIS-3 applications for each stage are listed in the arrows.

4.5. Overlap regions

As previously mentioned in Section 2, to avoid edge effects, HiRISE images are cut into screen-sized tiles such that there is a 100-pixel overlap with the neighboring tiles. This way, fans and blotches that cross the boundary between tiles are completely visible in at least one of the tiles of an area. However, from our own Planet Four marking efforts and from analyzing results from Planet Four volunteers, we have determined that the classification tools provide such a high level of precision in placement that many volunteers position and push a fan or blotch marking beyond the bounds of the shown image area to make it fit a partially shown fan or blotch. This results in several markings for the same object stemming from different Planet Four tiles, as shown in Fig. 19. It can be seen in this figure that the directions of the fans match, despite the fact that some tiles only showed a small part of a fan in the overlap area. We hence conclude that a wind direction analysis is not adversely affected by this analysis artefact. For a future study focusing on the area covered by markings and on counts of fan and blotch activity, we will implement a merging procedure to remove multiple markings, similar to the Combination step in our pipeline described in Section 4.3.

Figure 19: Six neighboring Planet Four tiles of HiRISE image ESP_011931_0945 are merged in this plot. The tiles have the following tile coordinates within the HiRISE image and Planet Four tile_ids: Upper Left: (1, 33), b1j; Upper Right: (2, 33), b10; Middle Left: (1, 34), b0p; Middle Right: (2, 34), b20; Lower Left: (1, 35), b0t; Lower Right: (2, 35), b0a (all 3-letter tile_ids must be prefixed with 'APF0000' for the full ID). The shapes of the tiles are distorted compared to their displayed on-screen size for this plot. Each tile was clustered individually, indicated by the different marking colors. The solid lines indicate where an unshared division between the tiles would lie; the dashed lines show the overlap region that was added to each tile to maximize the available information for the volunteers. This plot is instructive in showing how well the marked fans, specifically their directions, match, despite the fact that sometimes only a very small part of the whole fan marking was visible to the classifying volunteer. For increased precision in total marking counts and in the area covered by markings, we will design an object merging procedure on these overlap regions (next paper).

5. Data Validation

To date, there is no published catalog of the locations and numbers of seasonal defrosting features for any of the HiRISE images of the Martian south polar region to compare to the Planet Four results. In order to assess the accuracy and recall rate of Planet Four and to confirm that the majority of fans and blotches present in the HiRISE observations are identified when combining multiple classifier markings, we have created a 'gold standard' data set based on expert assessment. Using the same classification interface and marking tools on the Planet Four website as the citizen scientists used, the Planet Four Science team reviewed a subsample of the Seasons 2 and 3 tiles and produced a catalog of markings. Similar validation processes have been applied in our previous publication for the sister project Planet Four: Terrains [Schwamb et al., 2017a] and to crowd-sourced crater counting data for the Moon [Bugiolacchi et al., 2016; Robbins et al., 2014].

To generate the gold standard data set, 960 Season 2 tiles and 767 Season 3 tiles were randomly selected and equally divided amongst the three primary Planet Four Science Team members (GP, KMA, MES) for review. This corresponds to 3 % of the tiles from each season classified on Planet Four. Additionally, another 192 tiles, from both Season 2 and 3, were randomly chosen and classified by all science team gold standard classifiers in order to compare the science team markings to each other. This corresponds to approximately 0.4 % of each season's tiles. The Planet Four tile_ids of the gold standard classifications and the user names of the science team members that did the analysis are provided in the supplemental data file P4_catalog_v1.0_gold_standard_ids.zip.

[Figure 20 histogram, titled "Common Expert data vs Catalog". x-axis: # of fans+blotches per Planet Four tile; y-axis: # of tiles (logarithmic); series: GP, MES, KMA, catalog.]

Figure 20: Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data; here, for the 192 common tile_ids that were classified by all experts. Bin size is 5, and each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 75, omitting single-entry bins above.

[Figure 21 histograms, titled "Expert vs Catalog object identification frequency". Three panels (GP vs catalog, MES vs catalog, KMA vs catalog). x-axis: # of fans+blotches per Planet Four tile; y-axis: # of tiles (logarithmic).]

Figure 21: Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data. Bin size is 5, and each bin is directly compared between data from experts (in dark blue) and catalog data (in orange), with the experts GP, MES, and KMA respectively, from top to bottom. Each histogram contains data for 432 tiles, with each expert classifying an independent data set.

5.1. Counts of objects identified

We compare the expert classifications from the science team with our final catalog in order to explore how well fan and blotch features are identified and how accurately their shapes and dimensions are represented in the Planet Four catalog. We show a tile-based comparison in Appendix F.1, but first we examine the collective properties of the part of the Planet Four catalog that represents the gold standard tiles. We compare and contrast these distributions to the expert classifications together and per expert reviewer. Figure 20 compares the number distribution of identified sources (i.e. fans + blotches) per Planet Four tile between experts and the catalog data for the 192 common tiles that were classified by all three science team members (KMA, GP, MES).
Among the expert classifiers there are some visible differences, especially where the interpretation of a single image or two dominates the value of the histogram bin. The final catalog is within the variance of the individual expert assessments. We can see this further in Figure 21, which shows the number distribution of identified objects (i.e. fans and blotches together) per Planet Four tile when comparing the results for the tiles that were classified by only one of the science team members. We note that even tiles with 30 or 40 fans and/or blotches are still well represented in the catalog.

5.2. Fan lengths and blotch areas

[Figure 22 histogram, titled "Fan lengths, common expert data vs catalog". x-axis: Fan lengths [pixel].]

Figure 22: Comparing measured fan lengths between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 30, and each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single-entry bins above.

We also use our expert gold standard classifications to examine the physical sizes and areal coverage of the Planet Four catalog fans and blotches (see Figures 22 to 25). As in the previous comparisons, there is good agreement. The differences between the catalog and the experts are within the variance seen between the individual expert classifiers. Differences between the catalog and the experts become more apparent in small-number regimes (when fewer than 10 sources comprise a bin). These differences between the distributions are consistent with the small-number Poisson uncertainty on the histogram values [Kraft et al., 1991]. Thus, fan lengths and blotch areas are well reflected in the Planet Four catalog.
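To give a feeling for how large the Poisson uncertainty on sparsely populated histogram bins is, the sketch below computes classical (frequentist) central confidence intervals for a Poisson count by bisection on the cumulative distribution. This is a swapped-in, simpler technique and not the Bayesian treatment of Kraft et al. [1991]; the function names, the one-sided tail probability of 0.1587 (roughly a 1-sigma interval) and the interval-overlap check are all assumptions chosen for illustration.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def solve_mu(k, target, hi):
    """Find mu in [0, hi] with poisson_cdf(k, mu) == target (the CDF falls with mu)."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > target:
            lo = mid  # CDF still too large, so mu must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_interval(n, tail=0.1587):
    """Central interval for the Poisson mean given an observed count n."""
    hi = n + 10.0 * math.sqrt(n + 1.0) + 10.0  # generous upper search bound
    lower = 0.0 if n == 0 else solve_mu(n - 1, 1.0 - tail, hi)
    upper = solve_mu(n, tail, hi)
    return lower, upper


def counts_consistent(n_a, n_b):
    """Rough heuristic: do the intervals of two bin counts overlap?"""
    lo_a, hi_a = poisson_interval(n_a)
    lo_b, hi_b = poisson_interval(n_b)
    return lo_a <= hi_b and lo_b <= hi_a


for count in (0, 3, 7):
    print(count, poisson_interval(count))
print(counts_consistent(3, 7), counts_consistent(3, 30))
```

For bins with fewer than about 10 entries the intervals are wide relative to the count itself, which is why differences of a few objects between an expert and the catalog in such bins are not significant on their own.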
Figure 23: Comparing measured fan lengths between experts and the catalog data. Bin size is 30, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single entry bins above.

Figure 24: Comparing measured blotch areas between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5000, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 80,000, omitting single entry bins above.

Figure 25: Comparing measured blotch areas between experts and the catalog data. Bin size is 5000, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 120,000, omitting single entry bins above.

5.3. Wind direction comparison

Figure 26: Left: From the 192 tiles that were analyzed by the science team, 82 resulted in fan catalog entries. Of those, we used 39 that had more than 3 fans, for better statistics (the median number of fans per tile is 4, see Section 6). In this histogram, we show the difference between the mean angle of the fans in these 39 Planet Four tiles between the science team and the volunteers. Overall, we have a good agreement, with a few rare outliers, discussed in the text and in Figures 27 and 28. Bin size is 2. Right: Standard deviations (STDs) of the directions of fan markings that went into each cluster, before they are merged into the average resulting catalog object. This plot shows the distribution of these STDs for the set of 192 common gold tiles, which had a total of 904 fans. Bin size is 1.

Fig. 26, left, shows a histogram of the differences in the mean-over-tile fan directions between the catalog entries that are clustered from all the volunteers' markings and the average from the three science team members. In general, the agreement is very good, with differences usually smaller than 10 degrees. Another way to investigate our uncertainties is to calculate the angular standard deviation of the member markings of each cluster that are merged into the final catalog objects, regardless of whether the markings were made by an expert or a volunteer.

Fig. 27 discusses the lower outlier of Fig. 26, indicating that the respective Planet Four tile presents a more difficult than usual scenario, with a naturally occurring higher variance of the actual deposit directions on the ground. Not only are the deposit shapes visible in the upper left more irregular than usual, there is also a visible gradient of directions across this tile, as can be seen by the exaggerated fan pointers.
This gradient is probably caused by the basin shapes in the Inca City region, which can impose topographical control on the alignment of fan deposits over the usual wind control. Nevertheless, our reduction pipeline reliably reduces the markings for every deposit, albeit with higher than usual variance in the orientation and size of the markings. With no single clear fan direction in the image tile, it is reasonable to expect a higher variance and hence a higher delta when compared to the 3 science team members.

In a similar fashion, Fig. 28 discusses the high-side outlier of Fig. 26. While fans have been identified, their count is low, so that small deviations have a larger effect on the comparison with the catalog data. Additionally, the few fans that are visible appear to show different directions, leading to a less certain fan direction with a higher variance, which in turn can lead to larger differences when comparing their values.

In Fig. 26, right, we plot the standard deviations for all 904 fan clusters for the 192 common tiles that were analyzed by all experts. The right end of this histogram is cut off by our angular clustering parameter of 20°, meaning larger angular differences are never clustered together. However, the majority of standard deviations lie far below that safety cut-off value for the clustering. We estimate an average uncertainty for our fan directions of about (5 ± 3)°, using a half maximum width of this histogram. The actual uncertainty depends strongly on the quality of the data, as given by the HiRISE binning mode, and on the local variability of winds, which leads to increased diffusion of the deposits. We believe these factors lead to the non-Gaussian skew of the histogram. Additional validation results can be found in Appendix F.

5.4. Summary

In conclusion, our catalog has high completeness in most cases. Outliers have been found to be caused by special circumstances with more challenging classification tasks, creating higher variance for all classifiers, including the experts. The analysis of the gold standard sample demonstrates that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images.

Figure 27: One of the outliers of Fig. 26, Planet Four tile ID APF00002aj of HiRISE image ESP_012744_0985. The input image shows deposit shapes with less pronounced boundaries, leaking into the background. There is also a visible gradient of directions across the tile (visible through the extended fan pointers). See the text for more interpretation.

Figure 28: The highest outlier from Fig. 26, Planet Four tile ID APF0000c0t, from HiRISE image ESP_012858_0855. While fans have been identified, their number is small, increasing the chance for variance between the experts and catalog data.

6. Results: Fan and Blotch Catalog

From 221 HiRISE images from Mars years 29 and 30, cut up into 42,904 Planet Four tiles, the Planet Four volunteers produced almost 2.8 million fan markings that were clustered into 159,558 fans in our MY29/MY30 catalog. In Table 4 we show an example of fan catalog data. For blotches, 3.46 million raw markings were combined into 250,164 blotches. 29.6 % of the image tiles (= 12,693) end up not having any clustered markings in our catalog. Fig. 29 shows the distribution of the fraction of empty tiles per HiRISE image vs. solar longitude. Visual checks of data with fractions above 0.8 confirmed that these HiRISE images are mostly free of CO2 jet deposits in spring; in late summer, however, when the seasonal CO2 ice layer has fully sublimated, fan and blotch deposits are rendered mostly invisible, because they blend into the now ice-free background.
A notable exception to this general effect is the ROI Inca City, where the summer data, after Ls = 250°-260°, regularly show fan deposits that are still discernible. This could point to an interesting difference in the ground soil compaction and its related observed texture. New deposits from CO2 jet eruptions may be sufficiently different in texture from the background as a result of particle sorting and related phase function changes of the fresher surface.

Figure 29: Distribution of empty tiles over time, measured in Mars Solar Longitude. Until Ls = 260° the fraction of empty tiles per HiRISE image varies randomly, reflecting the different ground surfaces imaged across all latitudes. After Ls = 260° all CO2 is gone (earlier at lower latitudes), and most of the HiRISE images appear empty in terms of identifiable blotches or fans, because any deposits blend with the ice-free background.
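The empty-tile fraction plotted in Fig. 29 can be sketched as a simple per-image count. The snippet below is a minimal illustration under stated assumptions: the input structures (a list of (obsid, tile_id) pairs for all served tiles and a set of tile_ids that have at least one catalog entry), the function name empty_tile_fraction and the example identifiers are all hypothetical. Only the overall numbers 12,693 and 42,904 are taken from the text.

```python
from collections import defaultdict


def empty_tile_fraction(all_tiles, tiles_with_markings):
    """Fraction of tiles without any catalog entry, per HiRISE image.

    all_tiles: iterable of (obsid, tile_id) pairs for every tile shown.
    tiles_with_markings: set of tile_ids that appear in the catalog.
    """
    total = defaultdict(int)
    empty = defaultdict(int)
    for obsid, tile_id in all_tiles:
        total[obsid] += 1
        if tile_id not in tiles_with_markings:
            empty[obsid] += 1
    return {obsid: empty[obsid] / total[obsid] for obsid in total}


# Made-up example: two images, three tiles, one tile with catalog entries.
example_tiles = [("obs_A", "t1"), ("obs_A", "t2"), ("obs_B", "t3")]
example_marked = {"t1"}
fractions = empty_tile_fraction(example_tiles, example_marked)

# Overall empty fraction quoted in the text: 12,693 of 42,904 tiles.
overall_percent = round(100 * 12693 / 42904, 1)
```

In this made-up example, half of the tiles of obs_A and all tiles of obs_B are empty; the overall percentage reproduces the 29.6 % quoted above.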
    angle  distance  tile_id     image_x   image_y  marking_id  n_votes  obsid
0  205.56    179.71  APF0000ci9  2270.76  24336.16  F000000          35  ESP_012079_0945
1  185.39    179.62  APF0000cia  3391.21   5640.60  F000001          15  ESP_012079_0945
2  184.98    500.27  APF0000cia  3509.96   5876.70  F000002          10  ESP_012079_0945
3  184.29    105.43  APF0000cia  3716.27   5824.50  F000004           6  ESP_012079_0945
4  189.42    109.50  APF0000cia  3452.17   6033.00  F000005           3  ESP_012079_0945
5  194.16    335.78  APF0000cib  3565.47  15930.34  F000006          64  ESP_012079_0945
6  187.74    183.41  APF0000cib  3143.15  15433.60  F000007          20  ESP_012079_0945
7  209.47    179.29  APF0000cid   942.95  22257.99  F000008          58  ESP_012079_0945
8  199.91    220.64  APF0000cid  1199.11  21994.01  F000009          54  ESP_012079_0945
9  218.88    118.16  APF0000cid   815.95  22539.28  F00000a          42  ESP_012079_0945

   spread  version  vote_ratio       x  x_angle       y  y_angle      l_s  north_azimuth  map_scale
0   88.03        1        1.00  790.76    -0.90  224.16    -0.43  214.785     126.856883       0.25
1   21.35        1        1.00  431.21    -1.00  160.60    -0.09  214.785     126.856883       0.25
2   18.91        1        1.00  549.96    -1.00  396.70    -0.09  214.785     126.856883       0.25
3   26.41        1        0.68  756.27    -1.00  344.50    -0.07  214.785     126.856883       0.25
4   22.58        1        0.51  492.17    -0.99  553.00    -0.16  214.785     126.856883       0.25
5   34.93        1        1.00  605.47    -0.97  586.34    -0.24  214.785     126.856883       0.25
6   25.68        1        1.00  183.15    -0.99   89.60    -0.13  214.785     126.856883       0.25
7   49.11        1        1.00  202.95    -0.87  337.99    -0.49  214.785     126.856883       0.25
8   35.37        1        1.00  459.11    -0.94   74.01    -0.34  214.785     126.856883       0.25
9   49.66        1        1.00   75.95    -0.77  619.28    -0.62  214.785     126.856883       0.25

   BodyFixedCoordinateX  BodyFixedCoordinateY  BodyFixedCoordinateZ  PlanetocentricLatitude  PlanetographicLatitude  PositiveEast360Longitude
0            -65.804336            261.407884          -3370.504345              -85.427383              -85.480830                104.129523
1            -67.219114            257.011589          -3370.631413              -85.493546              -85.546226                104.656897
2            -67.170611            257.055226          -3370.630794              -85.493039              -85.545725                104.644396
3            -67.127761            257.024926          -3370.635002              -85.493723              -85.546401                104.637107
4            -67.169940            257.096267          -3370.628302              -85.492368              -85.545061                104.642019
5            -66.258570            259.361039          -3370.571273              -85.459101              -85.512180                104.330752
6            -66.400170            259.284370          -3370.565666              -85.459755              -85.512827                104.364183
7            -66.296391            261.048812          -3370.492211              -85.431209              -85.484612                104.249678
8            -66.261274            260.965240          -3370.497183              -85.432730              -85.486115                104.246813
9            -66.300167            261.124709          -3370.487589              -85.429945              -85.483362                104.246483

Table 4: First ten lines of the fan catalog file P4_catalog_v1.0_L1C_cut_0.5_fan_meta_merged.csv, broken into three segments.

Figure 30: Planet Four tile APF00006mr from HiRISE image ESP_011296_0975 has the highest number of resulting fan entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.

Figure 31: Planet Four tile APF00007t9 from HiRISE image ESP_012604_0965 has the highest number of resulting blotch entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.

6.1. Catalog properties

6.1.1. Fan counts

The highest counts of fans and blotches were 167 fans in the tile_id APF00006mr and 278 blotches in the tile_id APF00007t9, shown in Figures 30 and 31. These data serve as an indication of the dedication of the Planet Four volunteers, who produced results at such high spatial density. The median count of fans and blotches per tile is 4. The distribution of both numbers is shown in Fig. 32.

6.1.2. Fan lengths

As an example of the possibilities of the produced catalog, we describe the measured fan lengths in the catalog. The catalog column distance requires scaling by the values in map_scale, to correct for the different HiRISE binning modes. The distribution of these measurements is shown in Fig. 33. About 97 % of all fans are below 100 m in length, with a median value of 24 m. The three largest fans measured are all from the same ROI called Manhattan Classic (Lat −86.39°, Lon 99°), having lengths of 373 m, 368 m and 361 m respectively.
They were identified in the HiRISE images ESP_013095_0935 (longest) and ESP_011961_0935 (second and third). The two longest fan markings even identify the same fan, but at different times in the season, with the longest observed at Ls = 265°, and its shorter self at Ls = 209°. Since the two lengths differ by only 5 m, we attribute the increased marking measure both to material potentially being moved around by winds during spring and to a decrease of precision in identification after the CO2 has sublimed and the deposits start to fade into the background. However, we interpret the fact that the largest fan was identified twice as a further indication of the high reliability of our results, considering that the random image serving procedure of the Planet Four classification interface ensures that volunteers do not classify images in the order they were taken, which would have increased the chance of bias from their previous classifications. In this case, 119 volunteers classified APF0000dtk with the longest fan and 54 volunteers classified APF0000de3 with the second longest fan (shown in Fig. 14), and only one volunteer was common to both.

An overview of the fan length distributions for all major ROIs over both Martian years of data is shown in Fig. 34. When compared between Mars years 29 and 30, the total (over all ROIs) fan length statistics are very comparable, with a median of 24.2 m for MY29 and 23.8 m for MY30. However, we identify specific ROIs that have different fan properties between MY29 and MY30. For example, the ROI Manhattan has a median fan length of 42 m in MY29 and a decreased median of 25 m in MY30. This is in contrast with the ROI Giza, where the trend has the opposite direction: the median fan length is 44 m in MY29, comparable with Manhattan's median in the same year, but then increases to 59 m in MY30. Meanwhile, in ROI Ithaca, both years are very similar, with median fan lengths of 39.5 m and 38.6 m respectively.
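The scaling of the distance column described in this section can be sketched as follows. This is a minimal illustration, assuming that map_scale is given in meters per pixel so that the fan length in meters is the product distance × map_scale. The column names and the four sample rows are taken from Table 4; the helper name fan_lengths_m and the inline CSV are hypothetical stand-ins for reading the full catalog file.

```python
import csv
import io
import statistics

# First four rows of Table 4, reduced to the columns needed here.
SAMPLE_CSV = """marking_id,distance,map_scale
F000000,179.71,0.25
F000001,179.62,0.25
F000002,500.27,0.25
F000004,105.43,0.25
"""


def fan_lengths_m(rows):
    """Fan lengths in meters: pixel length times map scale (assumed m/pixel)."""
    return [float(r["distance"]) * float(r["map_scale"]) for r in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
lengths_m = fan_lengths_m(rows)

# Summary statistics analogous to those quoted in the text.
median_m = statistics.median(lengths_m)
fraction_below_100m = sum(length < 100 for length in lengths_m) / len(lengths_m)
```

With these four rows, the third fan (500.27 pixels at 0.25 m per pixel) is about 125 m long. Applying the same two statistics to the full catalog would be the way to check the median and the 97 % figure given above.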
Figure 32: Count distributions of catalog objects per tile, blotches on the left, fans on the right. The bin size is 5 counts in both plots.

Figure 33: Normalized histograms of all fan lengths in the catalog. Left: Log-Histogram, Right: Cumulative Histogram. The median (i.e. fraction of 0.5) value is at 24 m, with 82 % of the fans shorter than 50 m and 97 % shorter than 100 m, as indicated by the lines in the plot.

Figure 34: Boxplot showing the distributions of fan lengths for our set of regions of interest at the Martian south pole, over both Martian years of data, MY 29 and 30 (Inca City and Inca City Ridges are combined here due to their proximity). The boxplot uses the standard setup of the interquartile range (IQR) for the box, whiskers extending to 1.5×IQR, and single dots for outliers.

7. Wind Direction Results from Four Sample Regions of Interest (ROIs)

Early in the mission, the HiRISE team defined several regions of interest (ROIs) within the southern polar areas that have been extensively monitored for seasonal activity ever since (the list of original seasonal ROIs can be found in Hansen et al. [2010]). We have selected a sub-set of these ROIs to be analyzed by Planet Four, as shown in Table 1. The map of the ROIs' distribution over the pole is shown in Fig. 4.
Below we will focus on 4 example ROIs to showcase the use of the Planet Four data catalog and our ability to monitor wind directions using the positions and orientations of fan markings. We have picked these 4 ROIs (informally named Ithaca, Giza, Manhattan, and Inca City) for regional case studies of the seasonal winds because the temporal coverage over these locations is the highest. We describe each ROI's general setting and geomorphology based on observations of HiRISE and our previous works [Hansen et al., 2010; Pommerol et al., 2011]. We then present the wind direction maps over the spring season at each of these locations. The wind rose diagrams for each HiRISE image separately are available in the supplementary file P4_catalog_v1.0_wind_rose_diagrams.pdf.

7.1. Ithaca

The Ithaca region is located at southern latitude 85.2°, eastern longitude 181.4°. This location is away from the permanent polar cap, at the edge of the cryptic region, and situated on a surface that is relatively smooth on a large scale: the digital terrain model produced by HiRISE (DTEPD_040189_0950_040216_0950_A01) shows vertical elevation variations of less than 60 m across the Ithaca region. At the same time, on the meter scale the surface in Ithaca is rough, showing irregular and uneven bumps and pits. No araneiforms (i.e. radially-organized channels) were detected in Ithaca according to HiRISE imaging, while rare isolated troughs and patterned ground similar to araneiform troughs are present [Hansen et al., 2010].

During local spring, fan-shaped deposits densely cover the Ithaca region (see an example in Fig. 35). Opening angles and lengths of the fans were reported to evolve during spring, while the nature of these changes was not quantified [Thomas et al., 2010]. Multiple fans were observed to emerge from common vents, at times merging together to create a wider singular fan.
The directions of the fan deposits were noted to be consistent from one Martian year to another with only little variation.

An interesting detail about Ithaca is the very prominent bluish halos and fans that are repeatedly observed here [Thomas et al., 2010]. In contrast to the more common dark fan-shaped deposits, these halos and fans have higher albedos, approaching the albedo of fresh ice deposits. In Ithaca they are also distinctively bluer than the rest of the surface. There are at least two types of such bright deposits. One type resembles narrow fans that are located centrally over the older dark fans. These appear early in spring, before Ls = 190°. The other type resembles halos contouring the pre-existing dark fans. They appear on average later than the narrow bright fans.

In summer (Ls > 270°), the seasonal deposits are mostly invisible in Ithaca. Partially, this is because the small-scale roughness creates a patchy-looking environment with pits being darker than bumps, either due to shadows or dust collecting in depressions.

Fig. 35 shows a typical plot that we will use to analyze derived wind directions in our ROIs. This particular plot was created from Planet Four data for one HiRISE image (ESP_011931_0945) taken in Ithaca at Ls = 207°. To create this plot we took all the fan markings over the HiRISE image and plotted them as a histogram of their directions (top right panel of Fig. 35). Note that, in contrast to the standard wind rose diagrams showing the directions of the origin of winds, we use this diagram to show the measured deposition directions caused by the winds, i.e. the opposite of the wind origins. We chose this kind of display because it relates more closely to the actual measurements performed by the Planet Four project and does not imply any interpretation. The fan direction is counted clock-wise (CW) from the North Azimuth (NA) direction, where 0° always represents North, and 270° West. The histogram is not scaled, i.e. the y-axis shows the actual counts of the fan markings with the direction of each bin on the x-axis. The maximum of the histogram is the most probable direction for the markings, and the width indicates how variable the directions of the markings are for this particular Ls. The default size of each histogram bin is 3.6°. In exceptionally rare cases, the number of fan markings for a particular image, and thus the number of wind measurements, is low. Such cases require special treatment and an increased bin size.

On the top left panel the same data are plotted in the wind rose diagram. This time the histogram is normalized to highlight the difference in directions if several HiRISE images are plotted in the same frame. Note that the position of zero (NA direction) depends on the location of the ROI, i.e. the wind rose diagram is map-projected to the location of the data plotted. Thus, the direction of the fans can be directly compared to the map-projected HiRISE image (bottom panel). In this particular example one can see that the histogram has 2 peaks, indicating that there are two distinct directions of the fans. This can be either (1) because of overlapping fan deposits from jets that erupted from the same vents at different times prior to Ls = 207.8° under different wind regimes; or (2) because different areas of the ROI have distinctively different wind regimes. In this example, comparing the derived fan directions to the sub-frame of the HiRISE image indicates that the first case is more probable.

Fig. 36 shows directions of the fan deposits in Ithaca as retrieved by the Planet Four project for two Martian years: MY29 and MY30. We have separated the spring season into early spring, i.e. before Ls = 210°, and late spring, from Ls = 210° to Ls = 270°. The panels in this figure are organized such that columns show the separation into early and late spring while rows show MY29 and MY30.
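The direction histograms described above can be sketched in a few lines. The snippet assumes that the input directions are already expressed in degrees clock-wise from the NA direction; the 3.6° bin size is taken from the text, while the function names, the example angles and the choice of the bin center as the peak direction are hypothetical illustration choices, not the pipeline's implementation. The last helper converts a deposition direction into the direction the wind comes from, which is the convention of a standard wind rose.

```python
def direction_histogram(angles_deg, bin_size=3.6):
    """Unscaled counts of fan directions in bins of bin_size degrees."""
    n_bins = round(360 / bin_size)
    counts = [0] * n_bins
    for angle in angles_deg:
        # Wrap into [0, 360) and guard against rounding at the upper edge.
        index = int((angle % 360.0) / bin_size) % n_bins
        counts[index] += 1
    return counts


def peak_direction(counts, bin_size=3.6):
    """Center (in degrees) of the most populated bin."""
    index = max(range(len(counts)), key=counts.__getitem__)
    return (index + 0.5) * bin_size


def wind_origin(deposit_direction_deg):
    """Direction the wind blows from, opposite to the deposition direction."""
    return (deposit_direction_deg + 180.0) % 360.0


# Made-up directions: four fans pointing towards ~125 deg, one towards 300 deg.
example_angles = [123.0, 124.0, 125.0, 125.5, 300.0]
example_counts = direction_histogram(example_angles)
example_peak = peak_direction(example_counts)
```

For these made-up angles the four similar directions fall into a single 3.6° bin centered near 124.2°, and the corresponding wind origin is near 304.2°. A histogram with two comparably populated, well-separated bins would correspond to the double-peaked case discussed for Fig. 35.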
Ithaca fans sustain the same direction towards ≈125° through the whole spring in both years, with only a small shift towards East (see also top left panel of Fig. 40). In MY29 the fan direction histograms are wider than in MY30. A narrow histogram is an indication of small deviations of the governing winds at the times of jet eruptions. The shift of the mean wind direction is less than 10° in the early spring and the maximum shift is 25° over the whole season in MY29. Histograms widen with increasing Ls and sometimes develop double maxima, indicating more variability in the marked fan directions. This is also reflected in the increase of the standard deviation towards the end of spring. It can be attributed either to larger wind variability later in spring or to winds becoming strong enough to lift the particles from the ground at times between jet eruptions. Overall, MY30 shows similar behavior to MY29.

Figure 35: Fan directions in Ithaca region at Ls = 207.8° (top) and a subframe of HiRISE image ESP_011931_0945 (bottom) that can be directly compared to the wind rose diagram in the top left panel.

Figure 36: Direction of fan markings in Ithaca region for early and late spring of MY29 and MY30.

7.2. Giza

The Giza region is at southern latitude 84.8°, eastern longitude 65.7°. It is located closer to the edge of the permanent cap than Ithaca. It is also near a trough with exposure of southern polar layered deposits, while the area of Giza is flat on the km-scale (see HiRISE DTM DTEPC_004736_0950_005119_0950_A01). On smaller scales, as can be seen in multiple HiRISE images taken over this area (including those that were input for Planet Four), the region is covered in modulated bumps and small ripples. One side of this ROI is covered in yardangs. Very large and very intricate araneiform structures are located in this region. Their troughs are narrow and long, with high degrees of branching.
These araneiforms are very active in spring: multiple long and narrow fans emerge from their troughs and cover an extended area. HiRISE detected a dusty reddish haze over the araneiforms in Giza in several years, indicating active loading of dust into the lower layer of the atmosphere. The directions of the fans in late spring were previously noted to co-align with yardangs, suggesting that the wind regime in this area in summer stayed stable for an extended period of time [Hansen et al., 2010].

Similar to Ithaca, in Giza we do not observe significant differences in fan directions between MY29 and MY30 (Fig. 40 lower left panel and Fig. 37). Early images taken before Ls = 190° show very narrow histograms with a maximum between 300° and 310°. The maximum, which marks the direction of most fans, slowly shifts towards 360°. The shift rate is higher than in Ithaca (> 45° over the whole spring). The number statistics of fan detection worsen in late spring in both years, but this is particularly noticeable in late spring of MY30 (see histograms for the late spring of MY30). This is explained by decreasing contrast between the fan deposits and the undisturbed surface around fans in late spring images, i.e. the fans blending in with their environment.

Figure 37: Direction of fan markings in Giza region for early and late spring of MY29 and MY30. HiRISE images used: ESP_011447_0950, ESP_011448_0950, ESP_011777_0950, ESP_011843_0950, ESP_012212_0950, ESP_012265_0950, ESP_012344_0950, ESP_012704_0850, ESP_012753_0950, ESP_012836_0850, ESP_012845_0950, ESP_020150_0950, ESP_020401_0950, ESP_020480_0950, ESP_020783_0950, ESP_020902_0950, ESP_021482_0950, ESP_022273_0950.

7.3. Manhattan

The Manhattan region is in a very active area with at least 3 HiRISE ROIs that once were all considered under this same name. This area is around southern latitude 86°, eastern longitude 99°; like the two above, it is on the edge of, but still inside, the cryptic region.
The ROI is located on the eastern side of a South Polar Layered Deposit (SPLD) trough that in spring is completely covered with seasonal activity. The area is inclined towards the trough, i.e. in the north-west direction, although only slightly. According to the HiRISE DTM (DTEPC_022259_0935_022339_0935_A01), there is a 270 m elevation change over approximately 8 km (≈2° slope). Manhattan is covered in well developed interlaced araneiforms. Similar to Giza, the araneiforms here have thin and long troughs and branch significantly. Aside from araneiforms, the surface in Manhattan is smooth, even on scales of tens to hundreds of meters, with just a few exceptions of shallow irregular pits.

Seasonal activity is extensive in Manhattan, with dark fan deposits that at times develop bright halos. Intriguingly, the araneiforms' troughs become visibly brighter compared to the rest of the surface around Ls = 200° and stay bright almost until the region completely defrosts. Fans in Manhattan are directed 230° from the NA direction in the beginning of spring, as shown by the first observations of both analyzed years. This direction shifts during early spring and plateaus at 290° after Ls = 220°.

Figure 38: Direction of fan markings in Manhattan region for early and late spring of MY29 and MY30.

7.4. Inca City

Inca City is at latitude 81.3°, eastern longitude 295.7°; relative to the aforementioned ROIs it is on the opposite side of the permanent cap and the southern pole. The topography of this location is the most complex in our list (HiRISE DTM DTEPC_022699_0985_022607_0985_A01). It is a system of over 300 m-high ridges that crisscross each other at almost right angles, forming close-to-rectangular basins. The slopes of the ridges sometimes exceed 13°, providing a variety of insolation environments in a relatively small region. The inner surface of the basins is flat and most of the araneiforms of Inca City are carved in it.
The formation of the Inca City ridge system is debated but most commonly attributed to the interaction of irregularities of the local crust with an impact-induced compaction wave [Kerber et al., 2017]. Araneiforms in Inca City are morphologically different from those in Giza and Manhattan. They have a well-developed central depression with relatively short troughs extending outwards and are on average smaller.

Seasonal activity in Inca City starts at the slopes of the ridges [Thomas et al., 2010]. Fan deposits extend downwards following gravity lines. The fans are very narrow but do not have any features of the flows (dark flows come later in spring). It is not fully clear if the fans are directed by gravity or by downslope winds in this ROI. The surface around and near araneiforms, on the basin floor, gets covered mostly in blotches, suggesting that no significant winds are active inside the basins.

Directions of fans in Inca City are seemingly disordered, particularly in comparison to the 3 ROIs discussed above. However, Inca City is special in this set because it has prominent topography that the other 3 ROIs lack. Thus the analysis method that works well for our other ROIs might not be applicable to Inca City. The Inca City ridges affect the local deposition of solar energy and influence near-surface winds. Directions of fans in Ithaca, Giza, and Manhattan are modified by near-surface winds that normally pass undisturbed over the whole ROI. In contrast, in Inca City fans are observed almost exclusively on the slopes of the ridges and are aligned with the down-slope direction. However, these fans appear on the slopes gradually through spring: the first fans according to our analysis are pointing in the south-west direction (270° from NA), i.e. are located on south-west facing slopes. Early observations have the smallest standard deviation, indicating the smallest variation in the fan directions (Fig. 39).
However, even in the early histograms several local maxima may be detected. The locations of the secondary maxima are determined by the slopes that were covered by the HiRISE image at each Ls. Later in spring the fans start to appear on slopes with orientations other than south-west. This widens the histogram for each HiRISE image and makes the location of the histogram maximum a less and less relevant measure of the mean fan direction. This results in larger variation of the mean fan direction and large standard deviations (bottom right panel of Fig. 40). Local maxima repeatedly occur at the same directions from image to image in late spring, and the whole scenario repeats in both years with only small variations.

Figure 39: Direction of fan markings in Inca City region for early and late spring of MY29 and MY30.

Figure 40: Direction of fan markings in 4 ROIs vs Ls for MY29 and MY30. Directions are plotted in degrees relative to NA direction. Error bars represent the standard deviation of the data and not the error on the mean. Prevailing winds control the direction of fans in Ithaca, Manhattan, and Giza because the overall topography in these ROIs is smooth and has no obstacles significantly modifying the winds. In Inca City, however, the topography is more prominent, with 3 km-high ridges that break down the general winds and support creation of katabatic flows. Thus, the fans here follow slopes of the ridges rather than wind direction, which is reflected in the large scatter of mean fan direction and large standard deviations on mean fan direction.

8. Conclusions

The Planet Four project has produced a catalog of 159,558 fans and 250,164 blotches (ellipses), identifying locations of seasonal surface deposits produced by the CO2 jet processes occurring during spring in the Martian south polar region.
The catalog was generated by combining the assessments made by Planet Four volunteers reviewing a set of 42,904 tiles derived from 221 HiRISE observations obtained over 2 Martian Years, covering a set of 28 regions of interest (ROI) across the south pole. To date, this catalog serves as the largest reporting of locations, sizes, and mapping of seasonal deposits on the Martian surface.

The Planet Four fan and blotch catalog constitutes a resource for studying polar winds, climate and polar processes. Using south polar fans as regional wind markers, the Planet Four catalog can provide tests for and input to global and regional atmospheric circulation models. Statistical comparisons between classifications produced by the science team and catalog results for the same image data (Section 5) demonstrate that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images. Trend consistency for fan directions between Mars Years 29 and 30, despite the fact that most of the data were analyzed by different volunteers, further indicates the reliability of the methods presented here (see summary Figure 40). We have gone into considerable detail on the methodology behind the data in the catalog and are confident that its content can be productively used by our colleagues for their own research.

For 4 of the 28 ROIs we have presented mean fan directions. In three of these, the fan deposits appear to be directly modified by near-surface winds at the time of jet eruption; the fourth ROI shows the strong influence of topography.

In ROIs Ithaca, Giza, and Manhattan: The derived mean winds show no significant inter-annual variability between MY29 and MY30: their directions at the same Ls agree to within 10°.
In Inca City: The mean direction of the fans coincides with the direction of slopes and changes over spring as more slopes become exposed to sunlight and cold jet eruptions occur.

Our analysis in this paper focused on HiRISE observations from seasons 2 (MY29) and 3 (MY30) of the HiRISE southern seasonal processes campaign, and research into inter-annual variability starts to be feasible. However, the HiRISE campaign now covers 6 seasons of monitoring, and for a number of selected ROIs 5 of these have been or are being analyzed by the Planet Four project at the time of writing. The results from the analysis of these longer timespans and additional areal coverage will be topics of future publications and data releases.

Acknowledgements

The data presented in this paper are the result of the efforts of the Planet Four volunteers, generously donating their time to the Planet Four project, and without whom this work would not have been possible. Their contributions are individually acknowledged at http://www.planetfour.org/authors. Additionally we thank all those involved in BBC Stargazing Live 2013. This publication uses data generated via the Zooniverse.org platform, development of which was supported by the Alfred P. Sloan Foundation. The authors also thank Chris Lintott (University of Oxford), who had to decline authorship on this Paper. We thank him for his efforts contributing to the development of the Planet Four website and for his useful discussions.

MES is currently supported by Gemini Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., on behalf of the international Gemini partnership of Argentina, Brazil, Canada, Chile, and the United States of America. MES was also supported in part by an Academia Sinica Postdoctoral Fellowship and by a National Science Foundation (NSF) Astronomy and Astrophysics Postdoctoral Fellowship under award AST-1003258.
CM was supported by the 2014 Institute of Astronomy and Astrophysics, Academia Sinica (ASIAA) Summer Student Program. KMA and MES also thank the attendees of the Workshop on Citizen Science in Astronomy for the insightful conversations and acknowledge ASIAA and Taiwan’s Ministry of Science and Technology (MOST) for supporting the workshop. The authors also thank Greg Hines, Cliff Johnson, Margaret Kosmala, Chris Schaller, Brooke Simmons, and Ali Swanson for insightful discussions.

This work is also partially enabled by the National Aeronautics and Space Administration (NASA) support for the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE) team. This paper includes data collected by the MRO spacecraft and the HiRISE camera, and we gratefully acknowledge the entire MRO mission and HiRISE teams’ efforts in obtaining and providing the images used in this analysis. The Mars Reconnaissance Orbiter mission is operated at the Jet Propulsion Laboratory, California Institute of Technology, under contracts with NASA. The authors also thank Rod Heyd for guidance in extracting the geographic and location information for HiRISE non-map projected images. This research has made use of the USGS Integrated Software for Imagers and Spectrometers (ISIS) and of NASA’s Astrophysics Data System. KMA and GP were supported for this work by NASA ROSES Solar System Workings grant NNX15AH36G.

All software created for the pipeline is based on the open source language Python, using the matplotlib library [Hunter, 2007] for plotting, the pandas library for data wrangling and analysis [McKinney, 2010], the scikit-learn library [Pedregosa et al., 2011] for the clustering of Planet Four markings and other pre- and post-processing tasks, the IPython and Jupyter system for everyday computing [Perez and Granger, 2007], and the SciPy tools on a daily basis [Jones et al., 2001].

Appendix A. The Zooniverse’s Ouroboros Web Platform

In this Section, we briefly describe the Zooniverse’s Ouroboros web platform and how it interacts with the Planet Four classification interface. The Planet Four website and the Ouroboros platform are both hosted on Amazon Web Services. This makes it possible to rapidly scale up the number of servers based on the demand on the site, including handling the large number of classifiers during Stargazing Live 2013. The Planet Four classification interface is a JavaScript and CoffeeScript application that presents the classifier with the HiRISE tile and enables the volunteer to draw markers on the image and submit them for storage in the Planet Four classification database. The Zooniverse’s Ouroboros platform, written in Ruby on Rails, handles the back end storage of classifications in a Mongo database and determines the next tile that should be sent to a given Planet Four classifier for review. Active tiles are shown to 30–100 classifiers before being retired from rotation.

Once a classification is complete, the Planet Four interface sends the information via the Ouroboros Application Programming Interface (API) to be stored in the database and to update the classification count for the respective tile. If the activity on the website is low, this step is done immediately. If site traffic is high, for example 70,000 people on the website at once (such as during launch of the project), Ouroboros is designed to queue the classifications and store them asynchronously to the database so as not to impair the speed and performance of the Planet Four website. In this case the classification counts for the tiles and the list of tiles a registered or non-registered classifier has seen are not updated in real time.

The Planet Four web interface queries the Ouroboros API to identify the next tile to present to a classifier.
Ouroboros checks the database and selects a random active tile that has not been previously reviewed by the Zooniverse registered user or non-logged-in session. At any given time, Ouroboros readies a list of 5 tiles that the classifier has not seen. When presented with a request to see another image, the API sends back the next tile in this list. Typically this means the classifier is rarely, if ever, presented with the same tile twice. We note there was a bug in Ouroboros at launch that made repeats more prevalent. In Appendix B we describe our methods to cleanse duplicate and spurious classifications from our final data reduction.

We note that in the Zooniverse’s Ouroboros framework, refreshing the Planet Four interface in the browser will result in a new tile being selected and displayed without updating the classification database. Refreshing the browser is just as easy as hitting the ‘Finished’ and ‘Next’ buttons to move on to a new image, so we do not believe this has any significant impact on classifier behavior. We mention it for completeness only. Also, for the majority of the Season 2 and Season 3 classifications, a memory leak in the drawing library would cause the web browser to crash after a large number of fans were drawn in the image (more than approximately 30–50 sources). This impacted a very small fraction of tiles.

Appendix B. Handling of Duplicate or Spurious Classifications

With the Zooniverse Ouroboros queuing system (described in Appendix A), it is possible that a duplicate classification may occur, but these instances should be rare. A software bug in the Ouroboros platform caused a number of classifiers to receive the same cutout they had previously classified. Duplicate classifications are only a small portion of the data set, comprising 1.9% of all classifications produced, and typically only a few classifications or fewer per Planet Four image tile were duplicates in those cases.
In order to treat each classification as an independent assessment, we removed all duplicate classifications, keeping only the first response for a given registered user/non-logged-in session for a given cutout.

We also found a concentration of markings positioned at the top left corner (x=0, y=0) of the marking interface, with nearly all having default values for the other recorded parameters. Only 0.12% of the 9,631,517 markings recorded for Seasons 2 and 3 are affected. Further investigation shows that fewer than 7% of fan and blotch markings with default parameters with x=0 or y=0 are not centered at the origin. Thus, we believe these origin default-valued markings are due to a JavaScript error. Therefore, we simply delete them from the database, but keep any other markings associated with the affected classifications. Additionally, 33 markings (∼0.003% of all entries in the Planet Four classification database) do not have all of the required parameters that should have been recorded. We believe this is due to a singularity in the drawing tool for that marker, and we remove those entries from the database.

There are also positions in the database recorded for a handful of fans and blotches significantly out of the bounds of the user interface. A classifier can move a marker drawn outside the edge of the image, to better capture the center position of a feature, but these positions are well outside the image region. This represents well under 1% of all classifications, and we have removed them from the analysis presented here. All statistics and values reported in this Paper are after the filtering described above.

Appendix C. Raw Classification Data

Here we provide additional details about the raw classification data provided in the online supplementary data file [8]. It is written in the binary HDF5 format, in the variant produced by the pandas library (supported by the PyTables library [9]).
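As an illustration of the filtering described in Appendix B, the following minimal pandas sketch keeps only the first classification per volunteer and tile and drops fan or blotch markings placed exactly at the tile origin. The function names are hypothetical, the column names follow the column table of this appendix, and the origin test is a simplification: the actual cleaning also considered the default values of the other parameters, which are not listed in the text.

```python
import pandas as pd


def keep_first_classification(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the earliest classification per (user_name, tile_id)."""
    # Collapse to one row per classification, ordered by submission time.
    per_class = (
        df[["classification_id", "user_name", "tile_id", "created_at"]]
        .drop_duplicates("classification_id")
        .sort_values("created_at", kind="stable")
    )
    # The first classification each volunteer submitted for each tile.
    first_ids = per_class.drop_duplicates(["user_name", "tile_id"])[
        "classification_id"
    ]
    return df[df["classification_id"].isin(first_ids)]


def drop_origin_markings(df: pd.DataFrame) -> pd.DataFrame:
    """Drop fan/blotch rows located exactly at the tile origin (x=0, y=0).

    Simplified assumption: the default-value check is not reproduced here.
    """
    is_marking = df["marking"].isin(["fan", "blotch"])
    at_origin = (df["x"] == 0) & (df["y"] == 0)
    return df[~(is_marking & at_origin)]


# Made-up rows: volunteer "abc" classified the same tile twice (c1, c2),
# and classification c1 contains one fan stuck at the origin.
raw = pd.DataFrame(
    {
        "classification_id": ["c1", "c1", "c2", "c3"],
        "user_name": ["abc", "abc", "abc", "xyz"],
        "tile_id": ["APF0000p9t"] * 4,
        "created_at": pd.to_datetime(
            [
                "2013-01-08 23:25:43",
                "2013-01-08 23:25:43",
                "2013-01-09 10:00:00",
                "2013-01-09 11:00:00",
            ]
        ),
        "marking": ["blotch", "fan", "fan", "fan"],
        "x": [553.65, 0.0, 120.0, 300.0],
        "y": [355.817, 0.0, 80.0, 200.0],
    }
)
cleaned = drop_origin_markings(keep_first_classification(raw))
```

Deduplicating at the level of classification_id rather than individual rows keeps all markings of the retained classification together, which matches the idea of treating each classification as one independent assessment.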
The general structure is as follows: Each classification submission by an individual volunteer creates a classification_id. All objects created by this volunteer receive the same classification_id, with the marking data for each object being one entry in the classification database. Each data row also has a marking column that identifies whether the row is for a fan, a blotch, an interesting feature (string value “interesting” in the marking column), or “none” when the volunteer did not create any marking object. Below we describe the columns available in this database:

| Column name | Example value | Description |
|---|---|---|
| classification_id | 50ecaaf760d4050d21000414 | Unique ID for each classification by a Planet Four volunteer |
| created_at | 2013-01-08 23:25:43 | Time of submission |
| tile_id | APF0000p9t | Planet Four tile identifier |
| image_name | ESP_021491_0950 | HiRISE observation identifier |
| tile_url | http://www.planetfour.org/subjects/standard/50e741555e2ed211dc002346.jpg | URL to image data for this Planet Four tile |
| user_name | abc | Originally, the Zooniverse username or non-logged-in session ID. For privacy concerns, we have converted these to anonymous IDs. |
| marking | blotch | Identifier for what the data in the row is for: blotch, fan, interesting, none |
| x_tile | 1 | x coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases to the right. |
| y_tile | 2 | y coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases downwards. |
| acquisition_date | 2011-01-01 00:00:00 | Date only for HiRISE observation time (ignore hours) |
| local_mars_time | 5:43 PM | Local Mars time for given acquisition date |
| x | 553.65 | x pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases to the right. |
| y | 355.817 | y pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases downwards. |
| image_x | 2033.65 | x pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases to the right. |
| image_y | 37071.8 | y pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases downwards. |
| radius_1 | 295.195 | Semi-major axis of blotch object in pixels. NaN if not applicable (N/A) |
| radius_2 | 294.715 | Semi-minor axis of blotch object in pixels. NaN if N/A |
| distance | NaN | Length of fan object in pixels. NaN if data row is for blotch or interesting |
| angle | 27.4331 | Orientation of marking object with respect to tile image x-axis in degrees. Positive clockwise, zero to image right (same definition as HiRISE) |
| spread | NaN | Opening angle of fan objects in degrees. NaN if N/A |
| version | NaN | Version of tool used to create fan. NaN if N/A |
| x_angle | 0.887549 | Cartesian x coordinate of angle column on unit circle |
| y_angle | 0.460713 | Cartesian y coordinate of angle column on unit circle |

[8] 2018-02-11_planet_four_classifications_queryable_cleaned_seasons2and3.h5
[9] http://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables

The Planet Four classification interface recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The correct spread angle is recoverable from the values stored in the database. We denote those markings generated before the patch with the version flag set to 1.0 and those after with the version flag set to 2.0. We provide the corrected spread angle for the fans affected, but leave the version flag in the final catalog for reference.

To gather statistics on the understanding of the tutorial, the Planet Four classification database contains all the tutorial markings, indicated by a HiRISE image name of ‘tutorial’. For the delivered raw classification database, the range of the fan angles has been converted from -180–180 to 0–360, while the range of the blotch angles has been converted to 0–180, due to their rotational symmetry.

Appendix D. Pipeline outputs

The intermediate stages of the pipeline, as output by our clustering and combination pipeline, are identified with the level identifiers 1A, 1B, and 1C, indicating different stages of the processing pipeline, where the processing is done on a per-tile_id level. After this is done, the final step combines all the data from the tens of thousands of tile_id folders into a set of summarizing CSV files.

Appendix D.1. Directory file structure

The directory file structure of the pipeline products is as follows (examples in parentheses):

• HiRISE observation ID (ESP_011350_0945)
  – Planet Four tile ID (APF0000any)
    * Level 1A (L1A/APF0000any_L1A_fans.csv)
    * Level 1B (L1B/APF0000any_L1B_fnotches.csv)
    * Level 1C with cut value 0.5 in directory name (L1C_cut_0.5/APF0000any_L1C_cut_0.5_blotches.csv)

with the list of HiRISE observation IDs identifying the HiRISE observations that went into Planet Four for this database.

Appendix D.2. Pipeline stage levels

Appendix D.2.1. Level 1A

Level 1A is the data that is directly output from clustering and averaging the cluster members into average markings, as described in Section 4.2. Here, the biggest reduction in terms of numbers of objects in the system occurs, as all the different volunteers’ data are combined into one object when the clustering process has determined the markings to be part of one cluster. All newly created average fans and blotches are summarized into one fan and one blotch summary file respectively, with each line representing the mean object from averaging all cluster members. As an example, the content of APF0000p3q_L1A_fans.csv is shown below. When the column names match those given in Appendix C, they have the same meaning. The two new columns are n_votes, which records how many members the cluster had that was used to produce this averaged object, and marking_id, which is created at this stage of the pipeline and serves as a tracer throughout the different pipeline outputs:

| Column | Row 0 | Row 1 |
|---|---|---|
| x_tile | 2.0 | 2.0 |
| y_tile | 26.0 | 26.0 |
| x | 123.611111 | 157.000000 |
| y | 455.666667 | 391.800000 |
| image_x | 863.611111 | 897.000000 |
| image_y | 14155.666667 | 14091.800000 |
| radius_1 | NaN | NaN |
| radius_2 | NaN | NaN |
| distance | 81.884266 | 57.742472 |
| angle | 223.712817 | 248.754137 |
| spread | 71.559689 | 52.521798 |
| version | 1.0 | 1.0 |
| x_angle | -0.691035 | -0.360802 |
| y_angle | -0.660663 | -0.927999 |
| n_votes | 9 | 10 |
| image_id | APF0000any | APF0000any |
| image_name | ESP_011350_0945 | ESP_011350_0945 |
| marking_id | F006de3 | F006de4 |

Additionally, each L1A folder contains a text file called clustering_setttings.yaml that summarizes the clustering settings used for these data for reference. The epsilon values are static and all the same, but the min_samples value is calculated dynamically; see Section 4.2.1 for details.

Appendix D.2.2. Level 1B

At level 1B, the combination pipeline has determined which objects are so close to each other that they should be considered for merging (see Section 4.3). The outputs are between one and three files this time. There is only one file in case all fans and blotches found were so close that they need to be evaluated by their classification votes. Usually, though, there are two to three files, where one file stores the objects that need voting, and the other file(s) store the objects that don’t have any close neighbors and will simply be copied over to the final level later. The fans and blotches in these latter files will receive the ‘vote_ratio’ value of 1.0, indicating that they had a “perfect” probability for being a fan, or blotch, respectively. The file that keeps the close objects for the later thresholding contains these temporary meta-objects in sets of 2 rows, one fan and one blotch, and has the term “fnotch” in its filename (fnotches: FaN–blOTCH). This file contains all the clustering statistics data from L1A required to make a cut decision for L1C, with the data for each meta-object being sorted in alternating rows. Here are the first four rows of the fnotch file APF0000any_L1B_fnotches.csv:

| Column | fan | blotch | fan | blotch |
|---|---|---|---|---|
| angle | 223.712817 | 67.261720 | 247.146845 | 70.684606 |
| distance | 81.884266 | NaN | 58.742330 | NaN |
| image_id | APF0000any | APF0000any | APF0000any | APF0000any |
| image_name | ESP_011350_0945 | ESP_011350_0945 | ESP_011350_0945 | ESP_011350_0945 |
| image_x | 863.611111 | 838.395834 | 832.000000 | 821.666667 |
| image_y | 14155.666667 | 14123.875000 | 14306.400000 | 14281.428571 |
| marking_id | F006de3 | B0071f2 | F006de5 | B0071ed |
| n_votes | 9 | 8 | 5 | 7 |
| radius_1 | NaN | 49.309277 | NaN | 35.324591 |
| radius_2 | NaN | 36.981958 | NaN | 26.493443 |
| spread | 71.559689 | NaN | 81.171448 | NaN |
| version | 1.0 | NaN | 1.0 | NaN |
| x | 123.611111 | 98.395834 | 92.000000 | 81.666667 |
| x_angle | -0.691035 | 0.379131 | -0.387419 | 0.217508 |
| x_tile | 2.0 | 2.0 | 2.0 | 2.0 |
| y | 455.666667 | 423.875000 | 606.400000 | 581.428571 |
| y_angle | -0.660663 | 0.907431 | -0.919245 | 0.852341 |
| y_tile | 26.0 | 26.0 | 26.0 | 26.0 |
| vote_ratio | 0.539412 | 0.460588 | 0.426667 | 0.573333 |

This data stage L1B is what can be used to create a different significance threshold cut for the final data, by filtering on the data column vote_ratio in the fnotch file for the required threshold value. For example, if a higher threshold on the probability for a fan is wanted, e.g. 0.8, one would filter out all rows that start with “fan” with a vote_ratio value below 0.8. One then needs to decide whether to use this threshold as a general “certainty” filter and simply discard any object with a vote_ratio < 0.8, or whether the blotch should appear instead of the fan.

Appendix D.2.3. Level 1C

This level contains the data of the final catalog files, but split up by Planet Four tile. At the end of the thresholding stage (Section 4.3), the rows that pass the threshold filters are appended to the respective blotch and fan files, and these completed files are copied into the L1C directory, which completes the thresholding step and fills the L1C folders. A final tool walks through each folder and collects all the fan and blotch data into one summary file each, followed by merge operations with meta-data that is useful for future analysis. These files are described in the next section, Appendix E.

Appendix E. Planet Four Catalog files description

Our catalog product files consist of one CSV result file each for fan and blotch markings, a Planet Four tile meta-data file, and a HiRISE observation meta-data file. Below, each subsection describes the data columns for these files. For convenience we provide both the planetocentric and planetographic latitudes for each fan’s base and blotch’s center point. Longitudes are measured 0–360, increasing positive to the East. Note that, because the HiRISE images were not co-registered, the conversion of pixel to geographical coordinates can be offset by up to 100 HiRISE pixels between data from different HiRISE images.

Appendix E.1. Fan catalog

| Column name | Example value | Description |
|---|---|---|
| marking_id | F00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch |
| angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise |
| distance | 179.6 | Length of fan in pixels |
| tile_id | APF0000cia | Tile identifier in the Planet Four system |
| image_x | 3391.2 | Base X coordinate [px] in original HiRISE image |
| image_y | 5640.6 | Base Y coordinate [px] in original HiRISE image |
| n_votes | 15 | Number of markings that went into this average object |
| obsid | ESP_012079_0945 | HiRISE image observation id |
| spread | 21.346 | Spreading angle of fans |
| version | 1 | Version number of fan model used in Planet Four (see Appendix C) |
| vote_ratio | 1.0 | Ratio of votes from a potential combination step. Value of 1.0 means only fan votes occurred. |
| x | 431.206 | Base X pixel coordinate in the Planet Four tile |
| y | 160.6 | Base Y pixel coordinate in the Planet Four tile |
| x_angle | -0.995088 | Polar X coordinate of alignment angle |
| y_angle | -0.0938355 | Polar Y coordinate of alignment angle |
| l_s | 214.785 | Solar longitude of HiRISE observation |
| map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode |
| north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image |
| BodyFixedCoordinateX | -67.2071 | Base X coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateY | 257.05 | Base Y coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateZ | -3370.63 | Base Z coord. [km] in Mars-fixed ref. frame |
| PlanetoCentricLatitude | -85.493 | Latitude of catalog object (-centric) |
| PlanetoGraphicLatitude | -85.5457 | Latitude of catalog object (-graphic) |
| Longitude | 104.652 | Longitude of catalog object |

Appendix E.2. Blotch catalog

| Column name | Example value | Description |
|---|---|---|
| marking_id | B00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch |
| angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise |
| tile_id | APF0000cia | Tile identifier in the Planet Four system |
| image_x | 3391.2 | Center X pixel coordinate in the original HiRISE image |
| image_y | 5640.6 | Center Y pixel coordinate in the original HiRISE image |
| n_votes | 15 | Number of markings used for the average object |
| obsid | ESP_012079_0945 | HiRISE image observation id |
| radius_1 | 10.4 | Semi-major axis of blotch |
| radius_2 | 15.2 | Semi-minor axis of blotch |
| vote_ratio | 0.0 | Ratio of votes from a potential combination step. Value of 0.0 means only blotch votes occurred. |
| x | 431.206 | Center X pixel coordinate in the Planet Four tile |
| y | 160.6 | Center Y pixel coordinate in the Planet Four tile |
| x_angle | -0.995088 | Polar X coordinate of alignment angle |
| y_angle | -0.0938355 | Polar Y coordinate of alignment angle |
| l_s | 214.785 | Solar longitude of HiRISE observation |
| map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode |
| north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image |
| BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame |
| PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric) |
| PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic) |
| Longitude | 104.652 | Longitude of catalog object (Positive East 360) |

Appendix E.3. Planet Four tile catalog

Here we provide the data required to position the Planet Four tiles either back into HiRISE images, if so required, or directly onto the Martian surface, by using the provided latitude/longitude values or their map-value equivalents in the BodyFixed-Mars frame in a rectangular coordinate system, measuring kilometers from the south pole. The coordinate values come directly from the ISIS campt utility, while the x_tile and y_tile position indices of tiles inside the HiRISE image are the result of the splitting up routine that was developed by the Zooniverse team at the beginning of the project. All coordinates were calculated at the tile center pixel coordinate of (420, 324). The decimal digits precision was set to 7, guided by the Latitude/Longitude significant bits for a HiRISE pixel diameter on the ground for a 1x1 binning observation.

| Column name | Example value | Description |
|---|---|---|
| BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame |
| BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame |
| PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric) |
| PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic) |
| Longitude | 104.652 | Longitude of catalog object (Positive East 360) |
| tile_id | APF0000cia | Tile identifier in the Planet Four system |
| obsid | PSP_003092_0985 | HiRISE observation ID of the source image for this tile |
| x_hirise | 840 | X pixel coordinate of the tile center in the HiRISE image |
| x_tile | 5 | X index of the Planet Four tile inside the HiRISE image (1-based) |
| y_hirise | 648 | Y pixel coordinate of the tile center in the HiRISE image |
| y_tile | 11 | Y index of the Planet Four tile inside the HiRISE image (1-based) |

Appendix E.4. HiRISE observations catalog

This catalog provides the user with a list of HiRISE images and their meta-data that were used to create the Planet Four results presented here. The columns with capital letters were directly taken from the published cumulative EDR index [10]. The decimal digits precision was set to 7, guided by the Latitude/Longitude significant bits for a HiRISE pixel diameter on the ground for a 1x1 binning observation.

| Column name | Example value | Description |
|---|---|---|
| OBSERVATION_ID | ESP_011296_0975 | HiRISE observation identifier |
| IMAGE_CENTER_LATITUDE | -82.1965000 | Planetographic latitude of the HiRISE image center |
| IMAGE_CENTER_LONGITUDE | 225.2530000 | Longitude of HiRISE image center (positive west 360) |
| SOLAR_LONGITUDE | 178.8330000 | Solar longitude of HiRISE image. Equivalent to column l_s in the fan and blotch catalogs. |
| START_TIME | 2008-12-23 16:15:26 | UTC time of observation start |
| map_scale | 1.0000000 | Units: m/pixel. Calculated from EDRCUMINDEX by 0.25*BINNING |
| north_azimuth | 110.6001067 | The median north azimuth value for the HiRISE image, recalculated with ISIS’ campt, due to known errors in the HiRISE EDR index file |
| # of tiles | 91 | The number of created Planet Four tiles per HiRISE observation. Depends on original image size. |

[10] https://hirise-pds.lpl.arizona.edu/PDS/INDEX/EDRCUMINDEX.TAB

Appendix F. Extended validation results

In addition to the combined fan and blotch count we explored in Section 5, we further explore here how well the Planet Four catalog identifies fans (those dark sources with a clear direction and starting point) versus blotches, separately. We separate the catalog and gold standard classifications by marker type in Figures F.41 to F.44. The data processing pipeline plays a significant role in the completeness of the catalog. At the thresholding stage, our data processing algorithm determines which clusters will ultimately become fans with a value of P(fan) > 0.5. As for the total number of sources, the number distribution of fans and the number distribution of blotches match the expert assessments and are within the 3-σ uncertainty [Kraft et al., 1991]. Thus, in most cases where the science team member marked a fan, the catalog also identifies this source as a fan. Based on these results, we have high confidence in our fan and blotch identifications within the Planet Four catalog.
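The P(fan) > 0.5 decision mentioned above can be illustrated on an L1B fnotch table (Appendix D.2.2), in which each candidate is stored as a fan row followed by a blotch row with complementary vote_ratio values. The sketch below is only an assumption about how such a cut could be written with pandas; the function name, the pairing by row order, and the handling of values exactly at the cut are hypothetical and not taken from the pipeline code.

```python
import pandas as pd


def apply_fan_cut(fnotches: pd.DataFrame, cut: float = 0.5,
                  fallback_to_blotch: bool = True) -> pd.DataFrame:
    """Select one row per fan/blotch pair from an L1B fnotch table.

    Assumes rows alternate fan, blotch, fan, blotch, ... and that the
    index labels are unique.
    """
    keep = []
    for fan_idx, blotch_idx in zip(fnotches.index[0::2], fnotches.index[1::2]):
        if fnotches.at[fan_idx, "vote_ratio"] > cut:
            keep.append(fan_idx)      # fan is certain enough
        elif fallback_to_blotch:
            keep.append(blotch_idx)   # report the blotch instead
        # otherwise the whole pair is dropped ("certainty" filter)
    return fnotches.loc[keep]


# vote_ratio values of the four example rows shown in Appendix D.2.2.
fnotches = pd.DataFrame(
    {
        "marking_id": ["F006de3", "B0071f2", "F006de5", "B0071ed"],
        "vote_ratio": [0.539412, 0.460588, 0.426667, 0.573333],
    }
)

default_cut = apply_fan_cut(fnotches)             # cut at 0.5
strict_cut = apply_fan_cut(fnotches, cut=0.8)     # stricter fan threshold
certain_only = apply_fan_cut(fnotches, cut=0.8, fallback_to_blotch=False)
```

The fallback_to_blotch switch mirrors the two options described for a user-defined threshold: either use the cut as a general certainty filter, or let the blotch appear in place of a fan that fails the cut.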
[Figure: “Common Expert data vs Catalog: Fans only”; x-axis: # of fans per Planet Four tile; y-axis: # of tiles; series: GP, MES, KMA, catalog]

Figure F.41: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 60, omitting single entry bins above.

[Figure: “Expert vs Catalog object identification frequency: Fans only”; three panels (GP vs catalog, MES vs catalog, KMA vs catalog); x-axis: # of fans per Planet Four tile; y-axis: # of tiles]

Figure F.42: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data. Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 85, omitting single entry bins above.

[Figure: “Common Expert data vs Catalog: Blotches only”; x-axis: # of blotches per Planet Four tile; y-axis: # of tiles; series: GP, MES, KMA, catalog]

Figure F.43: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 60, omitting single entry bins above.

[Figure: “Expert vs Catalog object identification frequency: Blotches only”; three panels (GP vs catalog, MES vs catalog, KMA vs catalog); x-axis: # of blotches per Planet Four tile; y-axis: # of tiles]

Figure F.44: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data. Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 85, omitting single entry bins above.

Appendix F.1. Example tile comparisons

In Figures F.45 and F.46 we show an example comparison of volunteers’ markings with those performed by the science team. The aforementioned slight deviations of the science team members from each other are visible; however, it is clear that the catalog wind directions in Fig. F.45 are well reproduced by both the specialists and the volunteers. The results for blotches in Fig. F.46 are very comparable, with the added simplification that blotches have a much reduced directivity compared to fans.

Figure F.45: Comparing volunteers’ markings and the resulting clustering with the markings performed by science team members for Planet Four tile ID APF0000hqn of HiRISE image ESP_012316_0925. The extended fan center lines are 3 times exaggerated fan lengths to indicate the general trend of fan directions for easy visual comparison. The derived wind directions compare very well between the catalog and the science team data.

Figure F.46: Comparing volunteers’ markings and the resulting clustering with the markings performed by science team members for Planet Four tile APF000018t of HiRISE image ESP_012889_0985.
The blotches are very well comparable between the science team and the volunteers, with slight disagreements between the science team members.

References

Alger, M.J., Banfield, J.K., Ong, C.S., Rudnick, L., Wong, O.I., Wolf, C., Andernach, H., Norris, R.P., Shabala, S.S., 2018. Radio Galaxy Zoo: machine learning for radio source host galaxy cross-identification. Mon. Not. R. Astron. Soc. 478, 5547–5563.

Anderson, J.A., Sides, S.C., Soltesz, D.L., Sucharski, T.L., Becker, K.J., 2004. Modernization of the Integrated Software for Imagers and Spectrometers, in: Mackwell, S., Stansbery, E. (Eds.), Lunar and Planetary Science Conference, p. 2039.

Aye, K.M., Portyankina, G., Thomas, N., 2010. Semi-Automatic Measures of Activity in the Inca City Region of Mars Using Morphological Image Analysis, p. 2707. URL: http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2010LPI....41.2707A&link_type=ABSTRACT.

Banerji, M., Lahav, O., Lintott, C.J., Abdalla, F.B., Schawinski, K., Bamford, S.P., Andreescu, D., Murray, P., Jordan Raddick, M., Slosar, A., Szalay, A., Thomas, D., Vandenberg, J., 2010. Galaxy Zoo: reproducing galaxy morphologies via machine learning. Mon. Not. R. Astron. Soc. 406, 342–353.

Becker, K.J., Anderson, J.A., Sides, S.C., Miller, E.A., Eliason, E.M., Keszthelyi, L.P., 2007. Processing HiRISE Images Using ISIS3, in: Lunar and Planetary Science Conference, p. 1779.

Bird, R., Daniel, M.K., Dickinson, H., Feng, Q., Fortson, L., Furniss, A., Jarvis, J., Mukherjee, R., Ong, R., Sadeh, I., Williams, D., 2018. Muon Hunter: a Zooniverse project. arXiv:1802.08907.

Bowley, C., Mattingly, M., Barnas, A., Ellis-Felege, S., Desell, T., 2018. Detecting wildlife in unmanned aerial systems imagery using convolutional neural networks trained with an automated feedback loop, in: Computational Science – ICCS 2018, Springer International Publishing. pp. 69–82.

Bugiolacchi, R., Bamford, S., Tar, P., Thacker, N., Crawford, I.A., Joy, K.H., Grindrod, P.M., Lintott, C., 2016. The Moon Zoo citizen science project: Preliminary results for the Apollo 17 landing site. Icarus 271, 30–48. doi:10.1016/j.icarus.2016.01.021.

Clancy, R., Sandor, B., Wolff, M., 2000. An intercomparison of ground-based millimeter, MGS TES, and Viking atmospheric temperature measurements- Seasonal and interannual variability of temperatures . . . . Journal of geophysical . . . URL: http://www.agu.org/journals/je/je0004/1999JE001089/pdf/1999JE001089.pdf.

Crowston, K., Fagnot, I., 2008. The motivational arc of massive virtual collaboration, in: Proceedings of the IFIP WG 9.5 Working Conference on Virtuality and Society: Massive Virtual Communities, Lüneberg, Germany.

de Villiers, S., Nermoen, A., Jamtveit, B., Mathiesen, J., Meakin, P., Werner, S.C., 2012. Formation of Martian araneiforms by gas-driven erosion of granular material. Geophysical Research Letters 39, L13204. doi:10.1029/2012GL052226.

Ester, M., Kriegel, H.P., Sander, J., Xu, X., 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd URL: http://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf.

Ewing, R.C., Peyret, A.P.B., Kocurek, G., Bourke, M., 2010. Dune field pattern formation and recent transporting winds in the Olympia Undae dune field, north polar region of Mars. J. Geophys. Res. 115, E11007.

Fischer, D.A., Schwamb, M.E., Schawinski, K., Lintott, C., Brewer, J., Giguere, M., Lynn, S., Parrish, M., Sartori, T., Simpson, R., Smith, A., Spronck, J., Batalha, N., Rowe, J., Jenkins, J., Bryson, S., Prsa, A., Tenenbaum, P., Crepp, J., Morton, T., Howard, A., Beleu, M., Kaplan, Z., vanNispen, N., Sharzer, C., DeFouw, J., Hajduk, A., Neal, J.P., Nemec, A., Schuepbach, N., Zimmermann, V., 2012. Planet Hunters: The first two planet candidates identified by the public using the Kepler public archive data. Monthly Notices of the Royal Astronomical Society 419, 2900–2911. doi:10.1111/j.1365-2966.2011.19932.x.

Fortson, L., Masters, K., Nichol, R., Borne, K.D., Edmondson, E.M., Lintott, C., Raddick, J., Schawinski, K., Wallin, J., 2012. Galaxy Zoo: Morphological Classification and Citizen Science, in: Way, M.J., Scargle, J.D., Ali, K.M., Srivastava, A.N. (Eds.), Advances in Machine Learning and Data Mining for Astronomy. Chapman & Hall/CRC. Data mining and Knowledge Discovery, pp. 213–236.

Greeley, R., Arvidson, R.E., Barlett, P.W., Blaney, D., Cabrol, N.A., Christensen, P.R., Fergason, R.L., Golombek, M.P., Landis, G.A., Lemmon, M.T., Others, 2006. Gusev crater: Wind-related features and processes observed by the Mars Exploration Rover Spirit. Journal of Geophysical Research: Planets 111.

Hansen, C.J., Thomas, N., Portyankina, G., McEwen, A., Becker, T., Byrne, S., Herkenhoff, K., Kieffer, H., Mellon, M., 2010. HiRISE observations of gas sublimation-driven activity in Mars’ southern polar regions: I. Erosion of the surface. Icarus 205, 283–295. doi:10.1016/j.icarus.2009.07.021.

Hansen, G.B., 2005. Ultraviolet to near-infrared absorption spectrum of carbon dioxide ice from 0.174 to 1.8 μm. Journal of Geophysical Research 110, E11003. doi:10.1029/2005JE002531.

Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95. doi:10.1109/mcse.2007.55.

Jones, E., Oliphant, T., Peterson, P., 2001. SciPy: Open source scientific tools for Python. URL: http://www.scipy.org.

Kaufmann, E., Hagermann, A., 2016. Experimental investigation of insolation-driven dust ejection from Mars’ CO2 ice caps. Icarus 282, 118–126. doi:10.1016/j.icarus.2016.09.039.

Kerber, L., Dickson, J.L., Head, J.W., Grosfils, E.B., 2017. Polygonal ridge networks on Mars: Diversity of morphologies and the special case of the Eastern Medusae Fossae Formation. Icarus 281, 200–219. doi:10.1016/j.icarus.2016.08.020.

Kieffer, H.H., 2007.
Cold jets in the Martian polar caps. Journal of Geophysical Research 112, 08005. doi:10.1029/ 2006JE002816. Kraft, R.P., Burrows, D.N., Nousek, J.A., 1991. Determination of confidence limits for experiments with low numbers of counts. The Astrophysical Journal 374, 344–355. doi:10.1086/170124. Leighton, R.B., Murray, B.C., 1966. Behavior of Carbon Dioxide and Other Volatiles on Mars. Science 153, 136–144. doi:10.1126/science.153.3732.136. Lintott, C., Schawinski, K., Bamford, S., Slosar, A., Land, K., Thomas, D., Edmondson, E., Masters, K., Nichol, R.C., Raddick, M.J., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J., 2011. Galaxy Zoo 1: Data release of morphological classifications for nearly 900 000 galaxies. Monthly Notices of the Royal Astronomical Society 410, 166–178. doi:10.1111/j.1365-2966.2010.17432.x. Lintott, C.J., Schawinski, K., Slosar, A., Land, K., Bamford, S., Thomas, D., Raddick, M.J., Nichol, R.C., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J., 2008. Galaxy Zoo: Morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389, 1179–1189. doi:10.1111/j.1365-2966.2008.13689.x. Marshall, P.J., Lintott, C.J., Fletcher, L.N., 2014. Ideas for Citizen Science in Astronomy. ArXiv e-prints arXiv:1409.4291. McEwen, A.S., Eliason, E.M., Bergstrom, J.W., Bridges, N.T., Hansen, C.J., Delamere, W.A., Grant, J.A., Gulick, V.C., Herkenhoff, K.E., Keszthelyi, L., Kirk, R.L., Mellon, M.T., Squyres, S.W., Thomas, N., Weitz, C.M., 2007. Mars Reconnaissance Orbiter’s High Resolution Imaging Science Experiment (HiRISE). Journal of Geophysical Research: Planets 112, E05S02. doi:10.1029/2005JE002605. McKinney, W., 2010. Data Structures for Statistical Computing in Python, in: van der Walt, S., Millman, J. (Eds.), Proceedings of the 9th Python in Science Conference, pp. 51–56. 
Newman, C.E., Gómez-Elvira, J., Marin, M., Navarro, S., Torres, J., Richardson, M.I., Battalio, J.M., Guzewich, S.D., Sullivan, R., de la Torre, M., Others, 2017. Winds measured by the rover environmental monitoring station (REMS) during the mars science laboratory (MSL) rover’s bagnold dunes campaign and comparison with numerical modeling using MarsWRF. Icarus 291, 203–231. Nguyen, T., Pankratius, V., Eckman, L., Seager, S., 2018. Computer-aided discovery of debris disk candidates: A case study using the Wide-Field infrared survey explorer (WISE) catalog. Astronomy and Computing 23, 72–82. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikitlearn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830. Peng, T.r., English, J.E., Silva, P., Davis, D.R., Hayes, W.B., 2018. SpArcFiRe: morphological selection effects due to reduced visibility of tightly winding arms in distant spiral galaxies. Mon. Not. R. Astron. Soc. 479, 5532–5543. Perez, F., Granger, B.E., 2007. IPython: A System for Interactive Scientific Computing. Computing in Science Engineering 9, 21–29. doi:10.1109/MCSE.2007.53. Piqueux, S., Byrne, S., Kieffer, H.H., Titus, T.N., Hansen, C.J., 2015. Enumeration of Mars years and seasons since the beginning of telescopic exploration. Icarus 251, 332–338. doi:10.1016/j.icarus.2014.12.014. Piqueux, S., Byrne, S., Richardson, M.I., 2003a. Polygonal Landforms at the South Pole and Implications for Exposed Water Ice, in: Sixth International Conference on Mars, p. 3275. URL: http://adsabs.harvard.edu/cgi-bin/ nph-data_query?bibcode=2003mars.conf.3275P&link_type=ABSTRACT. 74 A rti cl ei n Pr es Piqueux, S., Byrne, S., Richardson, M.I., 2003b. Sublimation of Mars’s southern seasonal CO2 ice cap and the formation of spiders. Journal of Geophysical Research 108, 5084. 
doi:10.1029/2002JE002007. Pommerol, A., Portyankina, G., Thomas, N., Aye, K.M., Hansen, C.J., Vincendon, M., Langevin, Y., 2011. Evolution of south seasonal cap during Martian spring: Insights from high-resolution observations by HiRISE and CRISM on Mars Reconnaissance Orbiter. Journal of Geophysical Research 116, E08007. doi:10.1029/2010JE003790. Robbins, S.J., Antonenko, I., Kirchoff, M.R., Chapman, C.R., Fassett, C.I., Herrick, R.R., Singer, K., Zanetti, M., Lehan, C., Huang, D., Gay, P.L., 2014. The variability of crater identification among expert and community crater analysts. Icarus 234, 109–131. doi:10.1016/j.icarus.2014.02.022. Sauermann, H., Franzoni, C., 2015. Crowd science user contribution patterns and their implications. Proceedings of the National Academy of Sciences 112, 679–684. doi:10.1073/pnas.1408907112. Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C., Lintott, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S., McMaster, A., R. M Miller, G., 2017a. Discovery of araneiforms outside of the South Polar Layered Deposits, p. 422.05. URL: http://adsabs.harvard.edu/abs/2017DPS....4942205S. Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S., McMaster, A., Miller, G.R.M., 2017b. Planet Four: Terrains – Discovery of araneiforms outside of the South Polar layered deposits. Icarus doi:10.1016/j.icarus.2017.06.017. Schwamb, M.E., Lintott, C.J., Fischer, D.A., Giguere, M.J., Lynn, S., Smith, A.M., Brewer, J.M., Parrish, M., Schawinski, K., Simpson, R.J., 2012. Planet Hunters: Assessing the Kepler Inventory of Short-period Planets. The Astrophysical Journal 754, 129. doi:10.1088/0004-637X/754/2/129. Simpson, R.J., Povich, M.S., Kendrew, S., Lintott, C.J., Bressert, E., Arvidsson, K., Cyganowski, C., Maddison, S., Schawinski, K., Sherman, R., Smith, A.M., Wolf-Chase, G., 2012. The Milky Way Project First Data Release: A bubblier Galactic disc. 
Monthly Notices of the Royal Astronomical Society 424, 2442–2460. doi:10.1111/j. 1365-2966.2012.20770.x. Smith, D.E., Zuber, M.T., Neumann, G.A., 2001. Seasonal Variations of Snow Depth on Mars. Science 294, 2141– 2146. doi:10.1126/science.1066556. Smith, I.B., Spiga, A., Holt, J.W., 2015. Aeolian processes as drivers of landform evolution at the South Pole of Mars. Geomorphology 240, 54–69. doi:10.1016/j.geomorph.2014.08.026. Thomas, N., Hansen, C.J., Portyankina, G., Russell, P.S., 2010. HiRISE observations of gas sublimation-driven activity in Mars’ southern polar regions: II. Surficial deposits and their origins. Icarus 205, 296–310. doi:10. 1016/j.icarus.2009.05.030. Thomas, N., Portyankina, G., Hansen, C.J., Pommerol, A., 2011. Sub-surface CO2 gas flow in Mars’ polar regions: Gas transport under constant production rate conditions. Geophysical Research Letters 38, L08203–n/a. doi:10. 1029/2011GL046797. Willett, K.W., Lintott, C.J., Bamford, S.P., Masters, K.L., Simmons, B.D., Casteels, K.R.V., Edmondson, E.M., Fortson, L.F., Kaviraj, S., Keel, W.C., Melvin, T., Nichol, R.C., Raddick, M.J., Schawinski, K., Simpson, R.J., Skibba, R.A., Smith, A.M., Thomas, D., 2013. Galaxy Zoo 2: Detailed morphological classifications for 304 122 galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 435, 2835–2860. doi:10.1093/mnras/stt1458. Zachte, E., 2012. Wikipedia Statistics Tables English. URL: http://stats.wikimedia.org/EN/ TablesWikipediaEN.htm. 75
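The per-tile histogram comparison described in the caption of Figure F.44 (bin size 5, cut-off at 85) can be illustrated with a minimal sketch. It is not the authors' code: the function name `tile_count_histogram`, the example counts, and the treatment of the cut-off as the upper edge of the last bin are all assumptions made for illustration.

```python
import numpy as np


def tile_count_histogram(counts_per_tile, bin_size=5, max_count=85):
    """Count tiles per bin of 'blotches per tile' (hypothetical helper).

    Bins are [0, 5), [5, 10), ..., [80, 85]; tiles with more than
    max_count blotches fall outside the last edge and are dropped,
    mirroring the cut-off mentioned in the Figure F.44 caption.
    """
    edges = np.arange(0, max_count + bin_size, bin_size)  # 0, 5, ..., 85
    hist, _ = np.histogram(counts_per_tile, bins=edges)
    return edges, hist


# Made-up blotch counts per tile, for illustration only.
expert_counts = [0, 3, 4, 7, 12, 12, 40, 90]
catalog_counts = [1, 2, 6, 8, 11, 14, 41, 84]

edges, h_expert = tile_count_histogram(expert_counts)
_, h_catalog = tile_count_histogram(catalog_counts)

# Compare the two data sets bin by bin, as in the figure.
for lo, hi, n_exp, n_cat in zip(edges[:-1], edges[1:], h_expert, h_catalog):
    if n_exp or n_cat:
        print(f"{lo:2d}-{hi:2d}: expert={n_exp}, catalog={n_cat}")
```

Using the same bin edges for every data set is what makes the bin-by-bin comparison meaningful; the tile with 90 blotches in the made-up expert data is excluded by the cut-off.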