Planet Four: Probing Springtime Winds on Mars by Mapping the
Southern Polar CO2 Jet Deposits
K.-Michael Aye (a,∗), Megan E. Schwamb (b,c,d,e), Ganna Portyankina (a), Candice J. Hansen (f), Adam
McMaster (h), Grant R.M. Miller (h), Brian Carstensen (i), Christopher Snyder (i), Michael Parrish (i), Stuart
Lynn (i), Chuhong Mai (c,g), David Miller (i), Robert J. Simpson (h), Arfon M. Smith (i,j)
a Laboratory for Atmospheric and Space Physics, University of Colorado at Boulder, Boulder, CO 80303, USA
b Gemini Observatory, Northern Operations Center, 670 North A’ohoku Place, Hilo, HI 96720, USA
c Institute for Astronomy and Astrophysics, Academia Sinica; 11F AS/NTU, National Taiwan University, 1 Roosevelt Rd., Sec. 4, Taipei 10617, Taiwan
d Yale Center for Astronomy and Astrophysics, Yale University, P.O. Box 208121, New Haven, CT 06520, USA
e Department of Physics, Yale University, New Haven, CT 06511, USA
f Planetary Science Institute, 1700 E. Fort Lowell, Suite 106, Tucson, AZ 85719, USA
g School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287, USA
h Oxford Astrophysics, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
i Adler Planetarium, 1300 S. Lake Shore Drive, Chicago, IL 60605, USA
j Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218, USA
Abstract
The springtime sublimation process of Mars’ southern seasonal polar CO2 ice cap features dark
fan-shaped deposits appearing on the top of the thawing ice sheet. The fan material likely originates from the surface below the ice sheet, brought up via CO2 jets breaking through the seasonal
ice cap. Once the dust and dirt is released into the atmosphere, the material may be blown by the
surface winds into the dark streaks visible from orbit. The location, size and direction of these fans
record a number of parameters important to quantifying seasonal winds and sublimation activity,
the most important agent of geological change extant on Mars. We present results of a systematic
mapping of these south polar seasonal fans with the Planet Four online citizen science project.
Planet Four enlists the general public to map the shapes, directions, and sizes of the seasonal fans
visible in orbital images. Over 80,000 volunteers have contributed to the Planet Four project,
reviewing 221 images, from Mars Reconnaissance Orbiter’s HiRISE (High Resolution Imaging
Science Experiment) camera, taken in southern spring during Mars Years 29 and 30. We provide
an overview of Planet Four and detail the processes of combining multiple volunteer assessments
together to generate a high fidelity catalog of ∼400,000 south polar seasonal fans. We present
the results from analyzing the wind directions at several locations monitored by HiRISE over two
Mars years, providing new insights into polar surface winds.
Keywords: Mars, atmosphere; Mars, polar caps; Mars, surface; Mars, polar geology
∗ Corresponding author
Email address: michael.aye@lasp.colorado.edu (K.-Michael Aye)
Preprint submitted to Elsevier Journal
September 6, 2018
1. Introduction
Mars has a predominantly CO2 atmosphere with pressure levels buffered by seasonal CO2
polar caps [Leighton and Murray, 1966]. In the winter atmospheric CO2 falls as snow or condenses
directly onto the surface, forming a seasonal ice layer with a thickness of up to 1 m, depending
on the latitude. In the spring the south polar region of Mars exhibits a host of exotic phenomena
associated with sublimation of the seasonal CO2 polar cap, and sublimation winds [Smith et al.,
2001] contribute to atmospheric circulation.
In the south polar region images from the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE, McEwen et al. [2007]) document activity best described by the “Kieffer” model [Hansen et al., 2010; Kieffer, 2007; Piqueux et al., 2003a]:
1. Over the winter CO2 anneals to form a translucent slab of impermeable ice. Penetration of
sunlight through the CO2 ice, which warms the ground below, results in basal sublimation
of the ice.
2. The laboratory measurements done by Hansen [2005] show that up to 70 % of the solar
energy that reaches the top surface of a 1 m thick slab layer can be transmitted through it.
Recent laboratory experiments by Kaufmann and Hagermann [2016] were able to trigger
dust eruptions from a layer of dust inside a CO2 ice slab under Martian conditions, lending
further credence to the proposed CO2 jet and fan production model.
3. Trapped gas escapes through ruptures in the ice, eroding and entraining material from the
surface below [de Villiers et al., 2012].
4. When this dust-laden gas is expelled into the atmosphere the dust settles in fan-shaped deposits on the top of the ice in directions oriented by the ambient wind, as shown in Figure 1
[Thomas et al., 2010, 2011].
5. When the layer of seasonal ice sublimates in summer, the fans fade, as the material mostly
blends back into the surface [Hansen et al., 2010].
6. The compressed CO2 gas streams of the jets are believed to erode the surface, carving
uniquely Martian spidery channels originally identified in images from the Mars Orbiter
Camera [Piqueux et al., 2003b], now referred to as araneiforms [Hansen et al., 2010].
The number, time history, area covered and changes in direction of the fans provide a wealth
of information on the spring sublimation process and spring winds. Apart from a few wind direction
estimates from remotely observed dunes [Ewing et al., 2010] and surface rover wind measurements [Greeley et al., 2006; Newman et al., 2017], no widespread wind measurements exist for
Mars. The science goals enabled by cataloging fan measurements fall into two categories:
1. Enhance our understanding of spring winds and provide constraints for global and mesoscale
circulation models. The length, width, and direction of these fans are snapshots in time of
the local wind direction. Changes in the orientation of the fans over time record changes in
wind direction. These markers can be compared to predictions from global and mesoscale
circulation models (e.g. Smith et al. [2015]) to improve our understanding of Mars’ weather
in the polar regions. Dust injected into the atmosphere can be estimated.
2. Extend our understanding of the sublimation process and its efficacy as an agent of change
on the Martian surface. The number of fans as a function of time records sublimation activity
while the overlying ice thickness and insolation change during the season. The areal coverage of the fans allows us (with reasonable assumptions about particle size) to estimate the
amount of material eroded from the surface on seasonal timescales. Inter-annual variability
and the relationship of timing of seasonal activity to global dust storms can be quantified
with this data-set (These are topics of future papers).
Although the value of this data-set is clear, the sheer number of fans (on the order of hundreds
of thousands) present in HiRISE images from multiple locations and times observed over many
Mars years has proven to be a daunting data-set to catalog. Attempts at developing automated detection algorithms have been unsuccessful at identifying the locations and shapes of these seasonal
fans in images from orbit in a reliable fashion [Aye et al., 2010]. However, there is increasing
interest in using the outcomes of citizen science projects as training data for neural networks (e.g.
Alger et al. [2018]; Banerji et al. [2010]; Bird et al. [2018]; Bowley et al. [2018]; Nguyen et al.
[2018]; Peng et al. [2018]); hence we believe that these two lines of research will become strongly
complementary in the near future.
The task of mapping the dark fans is simply pattern recognition, and the human brain is ideally
suited for this task, easily capable of spotting and outlining these features. With the advent of the
Internet, tens of thousands of people across the globe can be enlisted to assist scientists with tasks
that are impossible to automate. This citizen science or crowd-sourcing approach, where independent assessments from multiple non-expert classifiers are combined, has become an established
technique as the data volumes have continued to grow. This method has been applied to nearly
all areas in astronomy and planetary science [Marshall et al., 2014] (see references therein), including galaxy morphology [Lintott et al., 2008; Willett et al., 2013], identification of planet transits
[Fischer et al., 2012; Schwamb et al., 2012], crater counting [Bugiolacchi et al., 2016; Robbins
et al., 2014], and a sister project of the effort presented here, Planet Four: Terrains [Schwamb
et al., 2017b]. In collaboration with the Zooniverse1 [Fortson et al., 2012; Lintott et al., 2011], the
largest collection of online citizen science projects, we have developed Planet Four2 , a web portal
to enlist the general public to identify and map the seasonal fans in HiRISE images of Mars’ polar
regions.
In this paper we present the first results from the Planet Four project, a catalog of seasonal fans
from two Mars years, MY 29 and 30, of HiRISE monitoring of the Martian South Polar region.
In Section 2, we provide an overview of the HiRISE South Pole Seasonal Processes Monitoring
Campaign and the specific HiRISE observations used in this study. In Section 3, we present the
Planet Four project and the online classification interface. Section 4 details the process for assessing and combining the volunteer classifications to create a catalog of seasonal features. In
1 http://www.zooniverse.org
2 http://www.planetfour.org
Figure 1: Subsection of HiRISE image ESP_011960_0925, taken at (LAT, LON) −87.303°, 167.970°; Ls 209.1°.
The image is approximately 321.4 m long and 416.6 m wide
Section 5 we examine our catalog’s validity by comparing results between volunteers and science
team members. Section 6 presents general statistical results of the catalog, and finally, we use the
catalog for an initial probing into regional winds in Section 7. We summarize our conclusions in
Section 8. All place names referred to in this paper are informal and not approved by the International Astronomical Union. Full machine-readable versions of the catalogs and tables presented
in this paper are also available from https://www.planetfour.org/results.
2. HiRISE Instrument and Seasonal Processes Monitoring Campaign
The Mars Reconnaissance Orbiter (MRO) has the ability to turn off nadir to target a specific
location. In its inclined orbit there are numerous opportunities to achieve repeat coverage in the
polar region. In order to study seasonal processes the HiRISE team selected a limited number of
regions of interest (ROIs) in the Martian south polar region to image throughout the spring season.
Time is defined on Mars by the orbital longitude Ls , where southern spring begins at Ls =180°.
Originally, the HiRISE monitoring campaigns were numbered by the ordinal number of the season in which the MRO mission had been observing Mars. This work focuses on the observations from
seasons 2 and 3, which have more regular repeat HiRISE imaging of ROIs over multiple years
than the season 1 HiRISE monitoring campaign. To be able to compare with other missions
and modeling, we also identify our data using the convention of Martian years, established by
Clancy et al. [2000] and Piqueux et al. [2015], where Mars Years 29 and 30, also written as MY29
and MY30, correspond to HiRISE seasons 2 and 3. Every day, citizen scientists are making more
fan measurements for later Mars years and the catalog continues to grow. The longer timespan
covered by the catalog will be discussed in future paper(s).
Figures 4 and 5 provide an overview of the observed locations and times in solar longitudes of
the HiRISE data used in this work. Table 1 lists the ROIs selected for analysis using Planet Four.
221 high quality images from southern spring season 2 and 3 (i.e. MY 29 and 30) were selected
for analysis on Planet Four (see Table 2). The reduced HiRISE products were obtained from the
National Aeronautics and Space Administration’s (NASA) Planetary Data System (PDS) HiRISE
PDS Data Node3 .
HiRISE is a pushbroom imager. It has ten 2048-pixel detectors in the cross-track direction,
which covers ∼6 km at the spacecraft altitude of 300 km (MRO is in an elliptical 255 km by 320 km
orbit). An image is built up in the along-track dimension as the spacecraft travels in its orbit, with
a ground velocity of ∼3 km s−1. A typical image has ∼60,000 pixels along-track and thus covers
an area of (6 × 18) km². Color is available in the center 20 % of the image. A full description of the
camera is found in McEwen et al. [2007].
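As a quick plausibility check, the approximate figures quoted above can be combined in a few lines of Python. This is only a back-of-the-envelope sketch: it uses the rounded numbers from the text, ignores any overlap between neighboring detectors, and all variable names are illustrative rather than taken from HiRISE software.

```python
# Back-of-the-envelope HiRISE footprint from the rounded values in the text.
N_DETECTORS = 10              # detectors in the cross-track direction
PIXELS_PER_DETECTOR = 2048    # pixels per detector
SWATH_KM = 6.0                # approximate cross-track coverage
ALONG_TRACK_PIXELS = 60_000   # typical image length in pixels
GROUND_SPEED_KM_S = 3.0       # approximate ground velocity
COLOR_FRACTION = 0.20         # central part of the swath with color

# Assumption: detectors are simply placed side by side (no overlap).
cross_track_pixels = N_DETECTORS * PIXELS_PER_DETECTOR
pixel_scale_m = SWATH_KM * 1000 / cross_track_pixels
along_track_km = ALONG_TRACK_PIXELS * pixel_scale_m / 1000
acquisition_s = along_track_km / GROUND_SPEED_KM_S
color_width_km = COLOR_FRACTION * SWATH_KM

print(f"cross-track pixels : {cross_track_pixels}")
print(f"implied pixel scale: {pixel_scale_m:.3f} m/pixel")
print(f"along-track length : {along_track_km:.1f} km")
print(f"acquisition time   : {acquisition_s:.1f} s")
print(f"color strip width  : {color_width_km:.1f} km")
```

Under these assumptions the implied pixel scale is roughly 0.3 m, the along-track length comes out close to the quoted 18 km, and the color strip is about 1 km wide, consistent with the values given in the text.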
It is generally easier to identify the fans in the color portion of the image, so only the ∼1 km
wide color (RGB) sub-image was used for the Planet Four image set. A visitor to the Planet Four
website is presented with a sub-image from an RGB non-map-projected HiRISE image. Each
HiRISE frame (typically several hundred megabytes in size) is divided into 840 × 648 pixel subimages that we will refer to as “tiles”. To avoid edge effects, the tiles are generated such that there
is a 100-pixel overlap with the neighboring tiles. We avoid showing volunteers tiles where part or
most of the tile is blank. Due to the variable length and width of HiRISE images, there is typically
a small region on the right and bottom edges of the non-map projected HiRISE image that cannot
be made into a full-sized tile and thus is not searched for seasonal features with Planet Four. Pixel
sampling scales per tile are typically 24.7 cm/pixel when HiRISE is in 1 × 1 binning mode, and
the seasons 2 and 3 observations span binning resolutions of 1 × 1 to 4 × 4. For the seasons 2
and 3 monitoring campaign, a HiRISE image is associated with 36 to 635 tiles (see Table 2). For
the analysis presented here 23,723 tiles derived from 129 full frame HiRISE season 2 monitoring
images and 19,181 tiles derived from 92 season 3 HiRISE images were reviewed by Planet Four
volunteers. A characteristic sample of Planet Four tiles is presented in Figures 2 and 3.
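The tiling scheme described above can be sketched as follows. This is a minimal illustration, not the code used to prepare the Planet Four tiles: the function name `tile_origins` is hypothetical, and it assumes that the 100-pixel overlap is realized by stepping the tile origin by the tile size minus the overlap. Skipping of blank tiles is not modeled.

```python
from typing import Iterator, Tuple


def tile_origins(
    width: int,
    height: int,
    tile_w: int = 840,
    tile_h: int = 648,
    overlap: int = 100,
) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) pixel origin of every full-sized tile.

    Neighboring tiles share `overlap` pixels. Leftover strips at the right
    and bottom edges that cannot hold a full tile are dropped, as described
    in the text.
    """
    step_x = tile_w - overlap  # assumed stepping scheme
    step_y = tile_h - overlap
    for y in range(0, height - tile_h + 1, step_y):
        for x in range(0, width - tile_w + 1, step_x):
            yield x, y


# Example: a small 2000 x 1500 pixel image yields a 2 x 2 grid of tiles.
for origin in tile_origins(2000, 1500):
    print(origin)
```

With this stepping, the right-hand 420 pixels and bottom 304 pixels of the example image are not covered by any tile, which mirrors the unsearched edge region mentioned above.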
3 http://hirise-pds.lpl.arizona.edu/PDS/
Latitude    Longitude         Informal Name            # of Images   # of Images
(degrees)   (degrees East)                             MY 29         MY 30
-73.53      339.5             Binghamton               2             0
-74.22      168.5             Caterpillar              1             0
-81.38      295.8             Inca City                7             7
-81.46      296.3             Inca City Ridges         7             8
-81.68      66.3              Potsdam                  7             9
-81.80      76.1              Starburst                7             3
-81.93      60.4              Albany                   5             0
-81.9       4.8               Buenos Aires             7             7
-82.2       225.2             Wellington               2             0
-82.3       306               Taichung                 1             0
-82.5       80.0              Buffalo                  2             0
-82.69      273.1             Cortland                 1             0
-83.2       158.4             Rochester                4             0
-84.82      65.7              Giza                     11            7
-85.0       95.0              Schenectady              1             0
-85.02      259.0             Troy                     1             0
-85.13      180.7             Ithaca                   10            6
-85.18      92.0              Geneseo                  0             1
-85.4       103.9             Macclesfield             7             7
-86.25      99.0              Manhattan Cracks         1             5
-86.39      99.0              Manhattan Classic        8             9
-86.8       178.0             Písaq                    3             1
-86.98      169.7             Atka                     3             0
-86.99      99.1              Manhattan Frontinella    5             3
-87.0       72.3              Halifax                  3             0
-87.0       86.4              Oswego edge              6             10
-87.0       127.3             Bilbao                   7             3
-87.3       167.8             Portsmouth               5             6
Table 1: Regions of interest studied with Planet Four that were monitored during both seasons 2 (Mars Year 29) and 3
(Mars Year 30) HiRISE Southern Seasonal Processes Campaign. A full list of the images is available as supplemental
data in the file P4_catalog_v1.0_metadata.csv. The Latitude and Longitude values are the mean values over the
center latitudes and longitudes of the respective HiRISE observations. All informal names are internal designations
used by the Planet Four team and not approved by the International Astronomical Union.
Figure 2: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is
reflected in the map_scale column of the Planet Four catalog files.
Figure 3: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is
reflected in the map_scale column of the Planet Four catalog files.
Figure 4: Map overview of the regions of interest for the seasonal monitoring campaign of HiRISE. For readability,
the following regions are shown as cyan-colored unlabeled dots: Inca City Ridges, Schenectady, Troy, Manhattan
Cracks, Manhattan Classic, Atka, Halifax, Oswego edge.
Figure 5: Temporal and latitude coverage for the season 2 and season 3 HiRISE monitoring campaign observations
reviewed on Planet Four.
Observation ID     Latitude   Longitude    Ls      Start Time   North Azimuth   # of
                   [deg]      [deg east]   [deg]                [deg]           Tiles
ESP_011296_0975    -82.197    225.253      178.8   2008-12-23   110.6           91
ESP_011341_0980    -81.797    76.13        180.8   2008-12-27   110.2           126
ESP_011348_0950    -85.043    259.094      181.1   2008-12-27   123.6           91
ESP_011350_0945    -85.216    181.415      181.2   2008-12-27   99.7            126
ESP_011351_0945    -85.216    181.548      181.2   2008-12-27   128.0           91
ESP_011370_0980    -81.925    4.813        182.1   2008-12-29   110.6           126
ESP_011394_0935    -86.392    99.068       183.1   2008-12-31   139.4           72
ESP_011403_0945    -85.239    181.038      183.5   2009-01-01   106.5           164
ESP_011404_0945    -85.236    181.105      183.6   2009-01-01   134.1           91
ESP_011406_0945    -85.409    103.924      183.7   2009-01-01   111.3           126
ESP_011407_0945    -85.407    103.983      183.7   2009-01-01   138.8           91
ESP_011408_0930    -87.019    86.559       183.8   2009-01-01   148.9           59
ESP_011413_0970    -82.699    273.129      184.0   2009-01-01   112.8           108
ESP_011420_0930    -87.009    127.317      184.3   2009-01-02   157.3           54
ESP_011422_0930    -87.041    72.356       184.4   2009-01-02   157.0           54
ESP_011431_0930    -86.842    178.244      184.8   2009-01-03   148.6           54
ESP_011447_0950    -84.805    65.713       185.5   2009-01-04   113.0           218
ESP_011448_0950    -84.806    65.772       185.6   2009-01-04   138.8           59
Table 2: Partial table of the HiRISE observations used, indicating their spatial and temporal coverage. The full table is published in
the online version. Latitude and longitude are the center coordinates of each HiRISE pointing used in this study. Latitudes are planetocentric,
and the given north azimuth angle is for the non-map-projected data that went into the Planet Four system.
3. Planet Four
Here we describe the Planet Four classification interface and the information generated by
volunteers visiting the Planet Four website.
3.1. Classification Web Interface
Planet Four volunteers are asked to identify and outline fans in the presented tiles. Sometimes
a feature has an indeterminate direction, in which case we call it a “blotch”. Although less useful for wind regime studies, the blotches are sites where the ice has ruptured and released material,
so they are important to studying the sublimation process of the polar CO2 ice sheet. Thus, volunteers are asked to identify and mark blotches as well. Positions, orientations, and sizes of fans
and blotches are obtained via a web interface (see Figure 6) built upon the Zooniverse’s Application Programming Interface (API), which communicates with their custom built Ouroboros web
platform (described in Appendix A). Each tile is assessed by approximately 30–100 independent
reviewers. To ensure reviewers have no prior information that may influence their judgment, tiles
are randomly served to the classifier, and no identifying information about the parent HiRISE image is presented in the Planet Four web interface. The volunteer is blind to the location on the
South Pole, time of season the observation was taken, and responses from other classifiers while
reviewing a given tile. Planet Four was launched originally in English; later, the website, classification interface, and help material were also translated into several languages, including
traditional and simplified character Chinese, German, and Magyar (Hungarian). For the analyses
presented here, all Planet Four classifications are treated the same, regardless of what language the
volunteer was using in the classification web interface.
3.1.1. Tutorial
First-time visitors to the Planet Four website are presented with a short inline interactive tutorial
that explains the task and guides the classifier on how to use the marking tools. Additional training
material is also available elsewhere on the site. The tutorial is shown only once for those classifiers
using the Planet Four web interface logged-in with a registered Zooniverse account. Volunteers
using the site in the non-logged-in mode are presented with the tutorial each time they visit the
Planet Four website. Other than the frequency of the tutorial appearing, the user experience on
Planet Four, including the tutorial content, is exactly the same for logged-in and non-logged-in
volunteers.
3.1.2. Marking Tools
Fans and blotches are drawn by selecting the appropriate tool in the classification interface (see
Figure 6), clicking on the tile displayed, and dragging to resize the marker to the appropriate shape
and orientation. The fan tool generates a triangle with a rounded base with the user controlling
the endpoint of the fan. The default opening angle for the fan marker is set to 5°. The blotch
tool simply produces an ellipse with the user controlling the size and orientation of the major axis.
For blotches, the default length of the minor axis is 0.75 times the pixel length of the major axis
drawn. Once a blotch or fan marking has been made, a classifier can edit the initial parameters
by manipulating handles on the marker. For blotches, the length of the major and minor axes and
rotation can be adjusted. For fans, the opening angle, orientation, and length can be modified. If
only a single mouse click is made on the interface, then the minimum-sized fan or blotch marker is
produced: a fan with a length of 10 pixels and an opening angle of 1° or an ellipse with both axes
equal to 10 pixels. Additionally, there is an ‘Interesting Feature’ tool available for volunteers to
highlight the position of anything that they deem worth review by the Planet Four Science Team.
The Interesting Feature marker is not resizable. All markers drawn in the web interface can be
repositioned or removed by the classifier.
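The marker defaults described in this section can be summarized in a small Python sketch. The class and function names below are hypothetical and are not taken from the Planet Four web interface code; only the numerical defaults (5° default spread, 0.75 axis ratio, 10-pixel and 1° minimum marker) come from the text, and the mapping from a mouse drag to a marker length is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SIZE_PX = 10.0                 # size of a single-click marker
DEFAULT_FAN_SPREAD_DEG = 5.0       # default opening angle of a dragged fan
MIN_FAN_SPREAD_DEG = 1.0           # opening angle of a single-click fan
DEFAULT_BLOTCH_AXIS_RATIO = 0.75   # default minor/major axis ratio


@dataclass
class FanMarker:
    x: float       # starting point of the fan (pixels)
    y: float
    length: float  # distance from the starting point to the fan end (pixels)
    spread: float  # opening angle (degrees)
    angle: float   # orientation (degrees)


@dataclass
class BlotchMarker:
    x: float       # center of the ellipse (pixels)
    y: float
    major: float   # major axis length (pixels)
    minor: float   # minor axis length (pixels)
    angle: float   # rotation of the major axis (degrees)


def new_fan(x: float, y: float, length: Optional[float] = None,
            angle: float = 0.0) -> FanMarker:
    """Initial fan marker; `length=None` stands for a single mouse click."""
    if length is None:
        return FanMarker(x, y, MIN_SIZE_PX, MIN_FAN_SPREAD_DEG, angle)
    return FanMarker(x, y, length, DEFAULT_FAN_SPREAD_DEG, angle)


def new_blotch(x: float, y: float, major: Optional[float] = None,
               angle: float = 0.0) -> BlotchMarker:
    """Initial blotch marker; `major=None` stands for a single mouse click."""
    if major is None:
        return BlotchMarker(x, y, MIN_SIZE_PX, MIN_SIZE_PX, angle)
    return BlotchMarker(x, y, major, DEFAULT_BLOTCH_AXIS_RATIO * major, angle)


print(new_fan(100, 200))                 # single click: minimum-sized fan
print(new_blotch(100, 200, major=40.0))  # dragged blotch: minor axis is 30 px
```

These are only the initial values; as described above, the classifier can subsequently adjust every parameter through the marker handles.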
Figure 6: The fan (above) and blotch (below) marker on the Planet Four tutorial image. Black circles and diamonds
are the marker handles that can be used to adjust the shape and orientation in the web classification interface. The “x”
is used to delete the marker.
3.2. Classification Database
Once the volunteer is done making markings, if any, and hits the ‘Finished’ button, the classification (which we define as the sum total of all the markings or lack of markings made by the
volunteer) is submitted to the Ouroboros API to be saved to a database. At this point, the classifier
can move on to view the next tile by hitting the ‘Next’ button or can choose instead to enter the
Planet Four discussion tool (discussed in further detail in Section 3.3). Once the classification has
been submitted, it cannot be revised. For blotches, the center position, rotation angle, and pixel
lengths of the major and minor axes of the ellipse are recorded. For fans, the starting position, distance in pixels from the starting point to the end of the fan, opening angle, and rotation angle are
saved to the database. For interesting features, only the pixel location is stored. If no features are
marked, the database records the classification as a non-marking. A tile identifier and timestamp
for each classification are also stored in the database.
If the volunteer is logged in with a registered Zooniverse account, the classifications are tracked
in the database via the associated username. For non-logged-in classifications, a unique session id
is generated and used to link the classifications completed by a given IP address and web browser.
The non-logged-in identifier does not exactly correspond one-to-one to a unique individual. If a
person classifies while not logged in and changes their IP address, their new classifications would be
stored under a different identifier. Additionally, if a volunteer initially participates as a non-logged-in classifier on Planet Four and then registers for a Zooniverse account, the previous classifications
stored in the database are not linked to the Zooniverse username and remain associated with the
unique non-logged-in session identifier.
We note there are occasional spurious or duplicate entries stored in the classification database,
typically due to a glitch in the classifier’s browser or a minor bug in the Ouroboros framework.
These entries compose a very small percentage of the total volunteer classifications. They are easily identified and removed from the analysis presented here. Further details are provided in Appendix B. Additionally, the Planet Four classification interface originally recorded a different angle
than the intended spread angle from the fan marking tool. This was identified and subsequently
fixed in the software. The true spread angle of the fan marker drawn by the volunteers is recoverable from the values recorded in the database, and we have adjusted the classifications
affected.
3.3. Talk Discussion Tool
Associated with the Planet Four classification interface is a dedicated object-orientated discussion tool known as “Talk”4 . Each Planet Four tile assessed on the main classification interface
has a dedicated page on the Planet Four Talk website. Volunteers can access these pages directly
through the classification interface after submitting their classification. With Talk, volunteers can
write comments, add searchable Twitter-like hash tags, create longer side discussions, and group
similar tiles together in collections. For the analysis presented here, we focus strictly on the volunteer markings from the main user interface, and do not include a complete analysis of the data
from the Talk tool.
4 http://talk.planetfour.org
Figure 7: Distribution of the number of Planet Four classifications for Season 2 (MY29) and Season 3 (MY30) tiles
with a bin size of 5. The distribution peaks at the two different retirement values of 100 and 30. Due to performance
issues in the webserver’s queueing system, the retirement values were at times not enforced, leading to the spread-out
distributions at values higher than the retirement values.
3.4. Site History
Planet Four was publicly launched on 2013 Jan 8 as part of the British Broadcasting Corporation’s (BBC) Stargazing Live, three nights of live astronomy programming (2013 Jan 8–10) on BBC
Two in the United Kingdom. Review of Season 2 and 3 tiles spans from January 2013 to March
2015 with 9,809,637 classifications produced in total. The majority of classifications for Seasons
2 and 3 were obtained during the BBC Stargazing period, but subsequently data from HiRISE’s
other seasonal monitoring campaigns were mixed with the Season 2 and Season 3 classifications.
The results from data outside season 2 and 3 which are still in the process of being reviewed on the
Planet Four website will be the topic of subsequent publications. Figure 7 plots the distribution
of classifications per tile for Seasons 2 and 3. Due to the high classification rate at launch, tiles
were set to retire from rotation in the web interface after 100 independent assessments (counting
duplicates) to ensure that the project would continue to serve data over the Stargazing period. Over
time the classification rate dropped significantly from launch, and on 2013 Dec 9 the retirement
threshold for a tile was lowered to a more reasonable — and statistically acceptable — value of
30 to better accommodate the actual work rate on Planet Four. This value is similar to the image
retirement threshold that was used by the Zooniverse’s Milky Way Project [Simpson et al., 2012],
which enlists the general public in a similar task, drawing circles on space-based infrared images
to identify the shape and size of star formation bubbles.
3.5. User Statistics
36,433 registered volunteers and 48,094 non-logged-in sessions have classified at least one
tile in our MY29/30 data-set. Volunteers made in total 9,461,062 classifications with a median
of 7 and average of 41 classifications per registered volunteer/non-logged-in session. The highest
number of different classifications (i.e. submitted Planet Four tiles) by the same volunteer was
31,808. After clean-up, Planet Four volunteers drew a combined 3,460,056 blotches, 2,694,415
fans, and 805,903 interesting features. Figure 8 shows the distribution of volunteer classifications
for Seasons 2 and 3 tiles combined. Individual registered volunteers (median of 14 and average
of 69 classifications per user) tend to contribute slightly more classifications than an individual
non-logged-in session (median of 4 and average of 21 classifications per session). A given volunteer/session reviews only a small percentage of the entire sample of HiRISE tiles. Only 15 % of
classifiers (12,483 registered volunteers and non-logged-in sessions) have contributed more than
50 classifications. Most volunteers contribute a few classifications of Planet Four tiles before leaving the site. This is a typical response for web-based projects [Crowston and Fagnot, 2008; Zachte,
2012] and is similar to the volunteer behavior found on other Zooniverse projects [Sauermann and
Franzoni, 2015].
Figure 8: Distribution of volunteer classifications. Figure a shows the combined distribution tallied together for both
logged-in and non-logged-in sessions. Figure b shows the volunteer classification count individually for registered
and non-logged-in volunteers. Both histograms use a bin size of 2.
4. Data reduction
In order to create fan and blotch object catalogs from the Planet Four markings, a reduction
pipeline was implemented, for which the code is open source and publicly available5. The pipeline
is written in the Python programming language, interfaces with the US Geological Survey’s
(USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker
et al., 2007], and makes use of the “scikit-learn” package for machine-learning related tasks [Pedregosa et al., 2011]. This data reduction pipeline has five main conceptual stages (see Fig. 9):
1. Cleanup, where the Planet Four classification data are cleaned, normalized, and converted to a binary database (Section 4.1).
2. Clustering, where the markings of the many different volunteers are combined into, ideally, one resulting average object (Section 4.2).
3. Combination, where we combine fan and blotch markings that seem to address the same visible object in the image into a meta-object for further processing during the next stage (Section 4.3).
4. Thresholding, where a cut on the required number of volunteers that voted for either fan or blotch decides whether the previously created meta-object should be considered a fan or a blotch (Section 4.3.1).
5. Ground Projection, where we project the HiRISE image pixel coordinates of the resulting fan and blotch markings into latitude and longitude coordinates on Mars (Section 4.4).
5 The pipeline is located at https://github.com/michaelaye/planet4.
[Figure 9 flowchart: Database → Cleanup → Clustering (per fans/blotches) → Fan & blotch overlapping? (yes: create meta-object with marking weights → Thresholding decides between fan and blotch; no: continue directly) → Ground Projection → Final fan/blotch catalog]
Figure 9: Overview of conceptual steps of the Planet Four data reduction pipeline.
4.1. Database Cleanup
After the removal of the tutorial data (see Section 3.1.1) and a first cleaning of spurious, incomplete and duplicate classification database entries (see Appendix B), we normalize all angles from the Planet Four classification interface and finally produce a binary database in HDF5 format (Hierarchical Data Format, version 5) for the remainder of the data processing. Angle normalization is required because the Planet Four system records blotch angles over a range from -180 to 180 degrees, while ellipses possess a two-fold rotational symmetry. This means that the range of 0 to 180 degrees suffices to fully describe blotches, once the radii are sorted in a consistent way (semi-major axis first). Volunteers start drawing the ellipses that mark blotches from either the semi-minor or the semi-major axis, with no consistent preference, which makes clustering on these parameters error-prone without normalization. The cleaned raw Planet Four classifications used in this work's analyses are provided as supplemental data in the file P4_catalog_v1.0_raw_classifications.csv. Further details about the format of the raw classifications are given in Appendix C.
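The angle normalization described above can be illustrated with a minimal sketch. The function name and argument order are hypothetical, and the 90 degree rotation applied when the radii are swapped is an assumption of this sketch (it keeps the angle attached to the semi-major axis); the text only states that radii are sorted and angles are reduced to the range of 0 to 180 degrees.

```python
def normalize_blotch(angle_deg, radius_1, radius_2):
    """Return (angle, semi_major, semi_minor) with the angle in [0, 180).

    Hypothetical helper: sorts the radii so that the semi-major axis comes
    first and folds the angle using the two-fold symmetry of an ellipse.
    """
    if radius_1 < radius_2:
        # Assumption: the ellipse was drawn starting from the semi-minor axis,
        # so swap the radii and rotate by 90 degrees to refer to the major axis.
        radius_1, radius_2 = radius_2, radius_1
        angle_deg += 90.0
    # Angles that differ by 180 degrees describe the same ellipse.
    return angle_deg % 180.0, radius_1, radius_2


print(normalize_blotch(-170, 10, 20))
print(normalize_blotch(190, 30, 5))
```

With this convention, two volunteers who outline the same blotch starting from different axes end up with comparable angle and radius values before clustering.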
4.2. Clustering
We identify fans and blotches by combining the multiple volunteer assessments of each Planet Four tile. To identify and precisely locate the marked features from the classifications performed by many volunteers per Planet Four tile (between 30 and 100, see Appendix A), we perform a clustering analysis on the data. Figure 10 shows an example of fan markings for a Planet Four tile. After evaluating several clustering algorithms, we identified the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm of Ester et al. [1996] as the most appropriate one for our application. DBSCAN has the advantage of not requiring the number of expected clusters as input. Instead, it is controlled by two input parameters: the minimum number of members of a cluster (min_samples) and the maximum distance for a data point to be included in a cluster (epsilon). (Details on how we determine these parameters are given in Section 4.2.1.) We set up our clustering pipeline using the DBSCAN implementation in the scikit-learn Python library [Pedregosa et al., 2011]. All volunteer responses are treated with equal weight in the clustering algorithm. Because the classification interface differs between fans and ellipse-shaped blotches (fans are drawn from a base point, blotches from the center), the fan and blotch markings are clustered separately at this stage and require their own sets of clustering parameters.
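As an illustration of this clustering step, the following sketch applies scikit-learn's DBSCAN to a handful of made-up fan base points and averages the members of each cluster. The function name, the example coordinates and the chosen values of eps and min_samples are assumptions for illustration only and are not taken from the pipeline itself.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_base_points(xy, eps=10.0, min_samples=4):
    """Cluster (x, y) marking positions and average each cluster.

    Returns the DBSCAN label per marking (-1 marks noise) and a list with
    one averaged object per cluster.
    """
    xy = np.asarray(xy, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(xy).labels_
    averaged = []
    for label in np.unique(labels):
        if label == -1:
            continue  # noise: markings the volunteers did not agree on
        members = xy[labels == label]
        averaged.append({
            "x": float(members[:, 0].mean()),
            "y": float(members[:, 1].mean()),
            "n_votes": int(len(members)),  # number of markings in the cluster
        })
    return labels, averaged


# Hypothetical base points from one tile: two features plus one stray marking.
markings = [(100, 100), (102, 101), (98, 99), (101, 103), (99, 97),
            (300, 200), (303, 201), (298, 199), (299, 200),
            (500, 500)]
labels, averaged = cluster_base_points(markings)
print(labels.tolist())
print(averaged)
```

The stray marking receives the label -1 and does not contribute to any averaged object, while the number of members per cluster is kept alongside the averaged position.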
In a first stage, we cluster the data of each Planet Four tile on the (x, y) pixel coordinates of the base point for fans and of the center for blotches (see Fig. 12 for a visual description of the available marking coordinates). Figure 13 shows the result of clustering on the two dimensions of the x and y base coordinates of the fan markings, using the multi-step approach shown in Fig. 11 and described below. Once the clusters for a given set of parameters (see Section 4.2.1 for details on the parameter tuning) have been defined, the original marking data of each cluster's members are averaged to create one average marking object per cluster, including average directions for fan objects (e.g. in Fig. 13). The number of markings that went into the creation of the averaged object is stored for later use.
After having clustered both fans and blotches on their base and center coordinates respectively,
we apply a second stage of clustering on the markings. For fan deposits, the major objective of this
Figure 10: Fan markings for Planet Four tile APF00001cl of HiRISE image ESP_012322_0985. Left: The cut-out
tile that is shown to the Planet Four volunteers. Right: 51 different users have classified this image. The colors cycle
through randomly for the markings of different users. With such a large number of different volunteers classifying,
the “sensitivity” for detection is increased, as notable by a few markings that outline even the smallest potential dark
deposit candidates. However, when the “crowd” does not agree with these, i.e. if the potential cluster does not reach
the min_samples number of required members, the clustering pipeline discards these entries, as shown in Fig. 13.
[Figure 11 flowchart. Fan clustering: base coords within 10 px; angle within 20º. Blotch clustering: center coords within 10 px and rad_1 & rad_2 within 30 px; center coords within 25 px and rad_1 & rad_2 within 50 px.]
Figure 11: The sequence of clustering steps for both fan and blotch markings. It became apparent during our studies that fan markings show less scatter, probably due to the tool having to be placed at a clearly identifiable base point. Blotches, however, do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. This required a second run of clustering with more relaxed cluster parameters, as described in Section 4.2.1 and in Table 3.
[Figure 12 diagram labels. Fan: Base Point (pixels), Distance (pixels), Angle from horizontal (degrees), Spread (degrees). Blotch: Center Point (pixels), Radius_1 (pixels), Radius_2 (pixels), Angle from horizontal (degrees). Axes: x and y, with pixel position (0,0) at the origin.]
Figure 12: The different coordinates available in the Planet Four marking catalog are described here. Fans possess (x, y) base coordinates, an angle from horizontal for their pointing and a spread angle. Blotches possess center (x, y) coordinates, semi-major and minor axis radii and also an angle indicating their alignment towards the horizontal.
Figure 13: Fans from Figure 10 for Planet Four subject ID APF00001cl after applying our clustering pipeline. Left: For direct comparison, the same markings as in the right panel of Fig. 10. Right: Results after clustering, identification of noise markings, and averaging the cluster members’ data into one object per cluster. Markings that do not become members of a cluster are defined as noise and are discarded from further processing (shown as white dots).
Figure 14: Planet Four tile APF0000de3 from HiRISE image ESP_011961_0935. It shows the prevalence and
precise identification of CO2 jet deposits with multiple directions that start from the same base point, indicating
multiple eruptions under different wind directions. The large fan is the second longest recorded in the catalog, with a
length of approx. 368 m.
work is to determine the wind direction they indicate. Due to this we want to be able to distinguish
between different wind directions from the same source point, i.e. multiple subsequent eruptions,
where later eruptions occurred with a different prevalent wind direction. In the Planet Four help
content we have emphasized that the volunteers should outline several fans if they appear to start
from the same source point. This is very relevant for data like that in Fig. 14, to identify several
wind directions indicated by the fans, from multiple subsequent jet eruptions. By clustering not
only on the base coordinates (x, y) but also on the recorded alignment angle of the fan markings,
we are able to distinguish these subsequent fan deposits with different wind directions.
By reviewing the clustering results of a subset of the data, we determined that a clustering value of 20 degrees for angles achieves this objective. This means that fan markings whose alignment angles differ by more than 20 degrees are clustered into their own sub-clusters, even if they start at the same base point. Blotches, on the other hand, are used for deposits that do not clearly indicate a direction, which is why we do not apply angle clustering to them. However, blotches do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. Thus, we also cluster on the ellipse radii of the blotches to ensure that we identify the statistically most common shape of the volunteers’ blotch markings.
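The separation of fans that share a base point but point in different directions can be sketched as a second DBSCAN run on the angles alone, here with the 20 degree value from the text. The function name, the example angles and the min_samples value are hypothetical. The use of a precomputed circular distance, so that 355 and 5 degrees count as 10 degrees apart, is an assumption of this sketch and is not described in the text.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def split_by_direction(angles_deg, eps=20.0, min_samples=3):
    """Label markings of one base-point cluster by pointing direction."""
    angles = np.asarray(angles_deg, dtype=float)
    # Pair-wise angular separation, folded into the range [0, 180] degrees.
    diff = np.abs(angles[:, None] - angles[None, :]) % 360.0
    diff = np.minimum(diff, 360.0 - diff)
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed")
    return model.fit(diff).labels_


# Hypothetical fan angles drawn from one shared base point: two eruptions
# under different wind directions.
angles = [10, 15, 12, 8, 100, 105, 98, 102]
direction_labels = split_by_direction(angles)
print(direction_labels.tolist())

# Angles on both sides of the 0/360 degree boundary stay in one group.
wrapped_labels = split_by_direction([355, 5, 358, 2])
print(wrapped_labels.tolist())
```

Each resulting label would then be averaged into its own fan object, so that one source point can carry several wind directions.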
The values of the clustering parameters strongly influence the number of identified features.
We therefore studied extensively how they affect our results by reviewing random subsets of the data-set, which led to the empirical determination of the clustering parameter values
Marking    Dimension     Small    Large
Fans       xy (base)     10 px    NA
Fans       angle (deg)   20       NA
Blotches   xy (center)   10 px    25 px
Blotches   radius (px)   30 px    50 px
Table 3: Empirically determined epsilon values for the clustering pipeline. NA: Fan markings did not require a second clustering run with relaxed precision on the distance. Apparently, the fact that a fan requires drawing from a distinguishable starting point helped the volunteers to keep the scatter small, both in base coordinates and angle precision.
that we eventually used for the catalog production. These procedures will now be discussed in
the following sections (see Fig. 15 for an example of reviewing parameter values). The results of
the clustering stage are then shown in the lower right (blotches) and lower middle (fans) parts of
Figures 16 and 17.
4.2.1. Cluster parameters
min_samples. As described in Section 3.2, Planet Four tiles have varying numbers of user classifications. The classifications for each Planet Four tile are therefore clustered separately, with a variable requirement on the min_samples clustering parameter. More classifications for a Planet Four tile mean that we have a higher “sensitivity” to smaller features (see for example Fig. 10, right), so to achieve a uniform detection efficiency, we apply a scaling factor to the required number of samples per cluster. A larger number of classifications results both in a higher sensitivity for marking seasonal fans and blotches and in more precisely averaged objects at the end of the clustering process. In other words, the signal-to-noise ratio (SNR) is higher for a Planet Four tile that was classified by a larger number of volunteers, and we adapted the clustering process to normalize for that fact.
To address the variable SNR in our data, we empirically determined a scaling factor min_samples_factor (MSF) that, multiplied by the number of classifications that contain blotch or fan markings, yields the min_samples value for the DBSCAN algorithm:

$$\mathrm{min\_samples} = \mathrm{round}\left(\mathrm{min\_samples\_factor} \cdot n_\mathrm{markings}\right),$$

with $n_\mathrm{markings} \le n_\mathrm{classifications}$ the number of classifiers that have added either blotch or fan markings as classifications.
The best value for MSF was empirically found to be 0.13. For example, when a Planet Four tile has $n_\mathrm{class} = 30$ classifications (our current retirement value) that all contain markings, min_samples will be 4. This value is the number of cluster members required for a cluster to be created. When a tile has 70 submissions, however, at least 9 cluster members are required for a cluster to be deemed a real detection and to be passed on to the next stage of the pipeline. In this way, we exploit the higher sensitivity provided by the larger number of submitted classifications.
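The scaling of min_samples with the number of classifications can be checked with a few lines of Python. The function name is hypothetical, and the sketch assumes that every classification of the tile contains at least one marking, so that the number of markings equals the number of classifications.

```python
def min_samples_for_tile(n_markings, min_samples_factor=0.13):
    """Required cluster size for a tile with n_markings contributing classifiers."""
    return round(min_samples_factor * n_markings)


for n in (30, 70):
    print(n, min_samples_for_tile(n))
```

Python's built-in round resolves exact halves to the nearest even integer; whether the pipeline rounds such cases in the same way is not stated in the text.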
epsilon. The second DBSCAN parameter, epsilon, describes the largest distance that two points are allowed to have in order to be considered part of the same cluster. The unit of this distance depends on the quantity currently being clustered. When we cluster on the base point coordinates of fans, or on the center coordinates or radii of blotches, the feature space is measured in pixels, while fan angles are clustered in degrees. The size scale of the dark fans and blotches varies significantly between different regions of interest at the south pole of Mars. When trying to cluster our data with only one value of epsilon, we realized that it was not possible to properly resolve small markings on the order of 20 pixels, which were precisely positioned by the volunteers, while simultaneously clustering the markings of much larger deposits that can stretch across more than half of the Planet Four tile shown to the volunteers. The spread in marking coordinates is smaller for smaller features (we think because of an increased attention to detail for smaller features). Thus, to ensure the identification of large features, we implemented a second stage of clustering with larger allowed values for epsilon. The resulting values in Table 3 were selected empirically after review of a random subset of the pipeline output. Fig. 15 shows an example of the parameter scan review graphic that the science team used to determine the parameter values that work best for our task.
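A parameter scan of this kind can be set up as a simple grid over the candidate values. The sketch below enumerates the six combinations reviewed in Fig. 15 (two min_samples_factor values and three epsilon values). The function name is hypothetical, and the use of 51 contributing classifications is an assumption based on the number of users quoted for this tile in Fig. 10.

```python
from itertools import product


def parameter_grid(n_markings, msf_values=(0.1, 0.13), eps_values=(10, 20, 30)):
    """List every (MSF, epsilon) combination with its resulting min_samples."""
    grid = []
    for msf, eps in product(msf_values, eps_values):
        grid.append({
            "msf": msf,
            "min_samples": round(msf * n_markings),
            "eps": eps,
        })
    return grid


# Assumption: all 51 classifications of the example tile contain fan markings.
grid = parameter_grid(n_markings=51)
for entry in grid:
    print(entry)
```

Each grid entry would correspond to one clustering run whose result is plotted and inspected by eye.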
Figure 15: This figure shows our review plots for determining the best clustering parameters for Planet Four tile ID 1cl. In this example, we review
the fan clustering with a group of 2 different min_samples values, controlled by using a min_samples_factor of 0.1 and 0.13 respectively, leading to
min_samples values of 5 and 7. Additionally, we are scanning the epsilon (EPS) value for small deposits with the settings 10, 20, and 30 pixels, while the
EPS_LARGE value stays at 25 pixels for these runs (having no effect in this case due to the small size of the markings). The upper left three plots are for the setting of
MSF=0.1 (resulting in a min_samples value of 5), and EPS between the 10, 20, and 30 pixel values. Then, the second group with an MSF of 0.13 (resulting
in min_samples=7), starts in the upper right with the fourth plot in the upper row, and continues in the lower left with the first two plots, again showing the
tests for EPS values 10, 20, and 30 pixels respectively. The last two plots in the lower row provide us with what the volunteers actually marked and what
they received as input for the markings, the Planet Four tile, cut out from the larger HiRISE images. The number of fans clustered varies significantly for
different clustering parameter values, with n between 11 and 16. We favor the setting in the upper right plot, for identifying correctly all small center fans,
while not creating an object for the small black spot at the top of the image tile.
Figure 16: This figure shows the final pipeline result of the tile from Fig. 15. Upper Left: The input tile; Upper Middle: Fan markings of the volunteers;
Upper Right: Blotch markings of the volunteers; Lower Right: Blotch markings after clustering and averaging the cluster members; Lower Middle: Fan
markings after clustering and averaging the cluster members; Lower Left: The final catalog entries. To obtain these, the results from Lower Middle and Lower Right are compared, and the markings with more votes at comparable locations win. How high that winning ratio must be for a marking to enter the final catalog is determined by the threshold value (see Section 4.3.1). Note how the center fans are cleanly identified and win the voting competition against the blotch at the same location. The opposite is true for the small object identified at the middle left, where a red blotch marking has won against the small cyan fan.
4.3. Combination
When the direction of a fan deposit is not very pronounced, i.e. the prevalent winds were weak at the time of the jet eruption, it is ambiguous whether the deposit should be identified as a fan or a blotch. As a result, a given ground source can have surviving clusters of both fan and blotch markings, which need to be combined in a strategic way to arrive at a final object category for the observed ground source in the resulting object catalog. We use the relative frequency with which each marking tool was used to create the two marking clusters to quantify how fan-like a source is. For example, if 5 people classified a marking as a fan, but 5 other people marked it as a blotch, we assign a fan probability P(fan) of 0.51 by applying

$$P(\mathrm{fan}) = \frac{n_\mathrm{fans} + 0.01}{n_\mathrm{fans} + n_\mathrm{blotches}},$$

with $n_\mathrm{fans}$ and $n_\mathrm{blotches}$ the number of volunteers that marked a fan or a blotch, respectively. The fudge value of 0.01 is required to be able to make an either-or decision for the object when $n_\mathrm{fans} = n_\mathrm{blotches}$; it tips this close call in favor of fans rather than blotches, because fans are more useful for further scientific analysis.
We determine to which markings this procedure is applied by calculating the pair-wise Euclidean distance between all clustered objects and checking whether two clusters lie within a chosen limit of 30 pixels of each other. We chose this value to allow slightly more imprecision in the markings’ positions than the clustering algorithm that created these averages, but without combining too many markings that really should be individual items. We have reviewed several hundred subsets of data and determined 30 pixels to be a good compromise between these competing goals. If a pair meets the distance criterion, we use the above formula to calculate P(fan) for this pair of markings. This value ranges from 0 to 1, with 0 being a definite blotch ($n_\mathrm{fans} = 0$) and 1 indicating a definite fan ($n_\mathrm{blotches} = 0$); in other words, either none or all of the volunteers drew a fan. We then create a meta-object for this pair, storing P(fan) under the name ‘vote_ratio’ in the catalog files, together with all other data for both objects. We do this to enable future users of the catalog to decide for themselves how reliably a marking must be identified as a fan before it is used as such in a study. For example, a specific study might require only the clearest fan markings, perhaps with a P(fan) larger than 0.8. Applying such a cut is called Thresholding in our pipeline and is described in the next section.
4.3.1. Thresholding
For concrete applications, e.g. for this publication, a scientist can now apply a cut on P(fan), which writes out the decision to a new catalog file with fans and blotches. For example, a cut on P(fan) of 0.8 means that all meta-objects with a value smaller than 0.8 are written out as the underlying blotch, while for meta-objects with a value larger than 0.8 the stored fan is written out. In both cases, the data of the marking that lost the thresholding are dropped from the newly created catalog file, but they remain available as an intermediate data product for other thresholding operations. An example use case is a scientist who wants to study the sensitivity of their research to the applied cut. For example, if we want to provide wind direction data to a mesoscale climate simulation, we might want to make sure that only the most certain directions are used and would apply a higher cut on the meta-object value.
For the catalog that we deliver with this work, we chose a simple majority threshold of 0.5, so that the catalog offers the broadest use case. Choosing a simple majority means that we take a marking as a fan as soon as at least as many volunteers have classified the object as a fan as have classified it as a blotch. Catalog files with this applied P(fan) threshold of 0.5, all intermediate data products, and instructions on how to apply a threshold for writing out new catalog files will be provided as supplementary products (see Appendix D for more details).
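The combination and thresholding logic can be summarized in a short sketch. All function and field names other than ‘vote_ratio’ are hypothetical, as are the example objects. The sketch assumes that the 30 pixel criterion is evaluated between the averaged fan and blotch positions and that a value exactly equal to the cut is written out as a blotch; neither detail is specified in the text. Evaluated as printed, the formula gives 0.501 for five fan and five blotch votes; the value of 0.51 quoted in the text would result if the offset were added after the division. The sketch follows the formula as printed.

```python
from math import hypot


def fan_probability(n_fans, n_blotches):
    """P(fan) with the 0.01 offset that breaks ties in favor of fans."""
    return (n_fans + 0.01) / (n_fans + n_blotches)


def combine(fans, blotches, max_dist=30.0):
    """Create a meta-object for each fan/blotch pair within max_dist pixels."""
    meta_objects = []
    for fan in fans:
        for blotch in blotches:
            dist = hypot(fan["x"] - blotch["x"], fan["y"] - blotch["y"])
            if dist <= max_dist:
                meta_objects.append({
                    "fan": fan,
                    "blotch": blotch,
                    "vote_ratio": fan_probability(fan["n_votes"], blotch["n_votes"]),
                })
    return meta_objects


def apply_threshold(meta_objects, cut=0.5):
    """Decide per meta-object whether the fan or the blotch is written out."""
    decided = []
    for meta in meta_objects:
        kind = "fan" if meta["vote_ratio"] > cut else "blotch"
        decided.append((kind, meta[kind]))
    return decided


# Hypothetical averaged cluster objects of one tile.
fans = [{"x": 100, "y": 100, "n_votes": 5}, {"x": 400, "y": 400, "n_votes": 8}]
blotches = [{"x": 110, "y": 105, "n_votes": 5}, {"x": 700, "y": 50, "n_votes": 6}]

meta_objects = combine(fans, blotches)
print(meta_objects[0]["vote_ratio"])
print(apply_threshold(meta_objects, cut=0.5))
print(apply_threshold(meta_objects, cut=0.8))
```

Because the vote ratio is stored with the meta-object, the same intermediate data can be thresholded again with a different cut without repeating the clustering.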
Figure 17: Three example Planet Four tile pipelines, for APF0000b0t, APF0000ops, and APF0000bk7. See Fig. 16
for a detailed description of the pipeline plotting sequence.
4.4. Ground Projection
For each Planet Four tile, the clustering of volunteer-drawn markings to identify seasonal sources is performed using the pixel positions within the tile. Once the cluster dimensions and positions have been identified, the source’s true location in the south polar region must be calculated. However, the non-map-projected color mosaics generated by the HiRISE team, from which the Planet Four tiles are derived, do not contain the spacecraft information necessary to compute the latitude and longitude per pixel. We therefore partially reconstruct the mosaics from the raw HiRISE image products, the Experiment Data Records (EDRs), building a red-filter-only composite image with the spacecraft information required to perform coordinate transforms. The HiRISE EDRs were obtained from the HiRISE Data Node of NASA’s Planetary Data System (PDS). We developed a reduction pipeline in Python for this purpose, using the US Geological Survey’s (USGS) Integrated Software for Imagers and Spectrometers (ISIS)6 [Anderson et al., 2004; Becker et al., 2007] and the ISIS-3 Python wrapper Pysis7.
We briefly summarize the steps to generate the red-filter-only mosaic, as shown in Fig. 18, including the required ISIS-3 application names. We start with the two center RED filter CCDs (RED 4 and 5), each with two readout channels. All four EDR files (two for each CCD) are read in and converted to ISIS-3 cube format, and the SPICE (Spacecraft & Planetary ephemerides, Instrument C-matrix and Event kernels) information for MRO is added to the EDR headers. For each CCD, we combine the two channel EDRs into a single image. The combined image is then normalized to remove both the striping and left/right normalization effects. This step is not necessary for obtaining map projection information, but it makes the final combined mosaic easier to inspect visually. Once both CCDs have been reduced, they are combined into a final mosaic, accounting for the 48-pixel overlap (in 1×1 binning).
Once the single-filter red mosaic is made, we are able to translate any fan and blotch pixel position to latitude and longitude on the south pole using ISIS-3’s campt application. The catalog table P4_catalog_v1.0_L1C_cut_0.5_fan_meta_merged.csv (and the corresponding _blotch_meta_merged.csv file for blotches), provided as supplemental files, include the cluster coordinates as latitude/longitude derived from this process, as well as a set of positional coordinates (X, Y, Z) in the body-fixed reference frame for Mars, measured in kilometers.
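The processing chain described above can be written down as an ordered list of ISIS-3 command lines. The sketch below only assembles the commands and does not run them. The application names are those given in the text; the parameter names (from=, to=, from1=, from2=, mosaic=, type=, sample=, line=), the function name and all file names are assumptions of this sketch and should be checked against the ISIS documentation. Placement parameters for handmos, which would account for the 48 pixel overlap, are left out.

```python
def isis_commands(obs_id, sample, line):
    """Assemble the command sequence for one HiRISE observation (not executed)."""
    commands = []
    ccd_cubes = []
    for ccd in (4, 5):  # the two center RED CCDs
        channel_cubes = []
        for channel in (0, 1):  # two readout channels per CCD
            stem = f"{obs_id}_RED{ccd}_{channel}"
            # Convert the EDR to an ISIS cube and attach SPICE information.
            commands.append(["hi2isis", f"from={stem}.IMG", f"to={stem}.cub"])
            commands.append(["spiceinit", f"from={stem}.cub"])
            channel_cubes.append(f"{stem}.cub")
        stitched = f"{obs_id}_RED{ccd}.cub"
        commands.append(["histitch", f"from1={channel_cubes[0]}",
                         f"from2={channel_cubes[1]}", f"to={stitched}"])
        normalized = f"{obs_id}_RED{ccd}.norm.cub"
        commands.append(["cubenorm", f"from={stitched}", f"to={normalized}"])
        ccd_cubes.append(normalized)
    mosaic = f"{obs_id}_RED45_mosaic.cub"
    for cube in ccd_cubes:
        # Placement parameters are omitted in this sketch.
        commands.append(["handmos", f"from={cube}", f"mosaic={mosaic}"])
    # Query the ground coordinates of one pixel position in the mosaic.
    commands.append(["campt", f"from={mosaic}", "type=image",
                     f"sample={sample}", f"line={line}"])
    return commands


commands = isis_commands("ESP_012322_0985", sample=100, line=200)
for command in commands:
    print(" ".join(command))
```

Keeping the commands as plain lists makes it easy to inspect the sequence before handing it to a wrapper such as Pysis or to a subprocess call.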
6 http://isis.astrogeology.usgs.gov/
7 https://github.com/wtolson/Pysis
[Figure 18 flowchart. Boxes: 4 images (2 center CCDs, 2 channels per CCD); Add SPICE data; Stitch channels, creating 1 img per CCD; Remove striping and normalization probs; Create mosaic by merging 2 remaining center CCD images; Translate fan and blotch pixel positions to lat/lon; Ground coordinates. ISIS-3 applications on the arrows: hi2isis, spiceinit, histitch, cubenorm, handmos, campt.]
Figure 18: Process for creating single channel non-map projected mosaics with the required SPICE header information used to convert Planet Four feature pixel coordinates to geographical lat/lon coordinates. The required ISIS-3 applications for each stage are listed in the arrows.
4.5. Overlap regions
As mentioned in Section 2, to avoid edge effects, HiRISE images are cut into screen-sized tiles such that each tile has a 100-pixel overlap with its neighboring tiles. This way, fans and blotches that cross the boundary between tiles are completely visible in at least one of the tiles of an area. However, from our own Planet Four marking efforts and from analyzing the results of Planet Four volunteers, we have determined that the classification tools provide such a high level of precision in placement that many volunteers position and push a fan or blotch marking out of the bounds of the shown image area to make it fit a partially shown fan or blotch. This results in several markings for the same object stemming from different Planet Four tiles, as shown in Fig. 19. The figure shows that the directions of the fans match, even though some tiles only showed a small part of a fan in the overlap area. We hence conclude that a wind direction analysis is not adversely affected by this analysis artefact. For a future study focusing on the area covered by markings and on counts of fan and blotch activity, we will implement a merging procedure to remove multiple markings, similar to the Combination step in our pipeline described in Section 4.3.
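The tiling with a 100-pixel overlap can be illustrated along one image axis. The function name, the image length of 2000 pixels and the tile length of 840 pixels are illustrative assumptions, as is the treatment of the last tile, which in this sketch is simply allowed to extend past the image border.

```python
def tile_starts(image_length, tile_length, overlap=100):
    """Start offsets of overlapping tiles along one image axis."""
    step = tile_length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile length")
    starts = []
    start = 0
    while True:
        starts.append(start)
        if start + tile_length >= image_length:
            break  # this tile already reaches the image border
        start += step
    return starts


starts = tile_starts(image_length=2000, tile_length=840)
print(starts)
```

Neighboring tiles share the last 100 pixels of one tile and the first 100 pixels of the next, which is the region where duplicate markings of the same object can arise.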
Figure 19: Six neighboring Planet Four tiles of HiRISE image ESP_011931_0945 are merged in this plot. The tiles have the following tile coordinates within the HiRISE image and Planet Four tile_ids: Upper Left: (1, 33), b1j; Upper Right: (2, 33), b10; Middle Left: (1, 34), b0p; Middle Right: (2, 34), b20; Lower Left: (1, 35), b0t; Lower Right: (2, 35), b0a (all 3-letter tile_ids need ‘APF0000’ prepended for the full ID). The shapes of the tiles are distorted in this plot compared to their displayed on-screen size. Each tile was clustered individually, as indicated by the different marking colors. The solid lines indicate where an unshared division between the tiles would lie; the dashed lines show the overlap region that was added to each tile to maximize the information available to the volunteers. This plot is instructive in showing how well the marked fans, and specifically their directions, match, despite the fact that sometimes only a very small part of the whole fan was visible to the classifying volunteer. For increased precision in total marking counts and the area covered by markings, we will design an object merging procedure for these overlap regions in a future paper.
5. Data Validation
To date, there is no published catalog of the locations and numbers of seasonal defrosting
features for any of the HiRISE images of the Martian south polar region to compare to the Planet
Four results. In order to assess the accuracy and recall rate of Planet Four and confirm the majority
of fans and blotches present in the HiRISE observations are identified when combining multiple
classifier markings, we have created a ‘gold standard’ data-set based on expert assessment. Using
the same classification interface and markings tools on the Planet Four website as the citizen
scientists used, the Planet Four Science team reviewed a subsample of the Seasons 2 and 3 tiles
and produced a catalog of markings. Similar validation processes have been applied in analyses of
our previous Planet Four publication for the sister project Planet Four: Terrains [Schwamb et al.,
2017a] and to crater counting crowd-sourced data for the Moon [Bugiolacchi et al., 2016; Robbins
et al., 2014].
To generate the gold standard data-set, 960 Season 2 tiles and 767 Season 3 tiles were randomly selected and equally divided among the three primary Planet Four Science Team members (GP, KMA, MES) for review. This corresponds to 3 % of the tiles from each season classified on Planet Four. Additionally, another 192 tiles, from both Season 2 and Season 3, were randomly chosen and classified by all science team gold standard classifiers in order to compare the science team markings to each other. This corresponds to approximately 0.4 % of each season’s tiles. The Planet Four tile_ids of the gold standard classifications and the user names of the science team members who did the analysis are provided in the supplemental data file P4_catalog_v1.0_gold_standard_ids.zip.
[Figure 20 plot. Title: Common Expert data vs Catalog. x axis: # of fans+blotches per Planet Four tile; y axis: # of tiles (logarithmic). Legend: GP, MES, KMA, catalog.]
Figure 20: Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data; here, for the 192 common tile_ids that were classified by all experts. Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 75, omitting single entry bins above.
5.1. Counts of objects identified
We compare the expert classifications from the science team with our final catalog in order to explore how well fan and blotch features are identified and how accurately their shapes and dimensions are represented in the Planet Four catalog. We show a tile-based comparison in Appendix F.1,
[Figure 21 plot. Title: Expert vs Catalog object identification frequency. Three panels: GP vs catalog, MES vs catalog, KMA vs catalog. x axis: # of fans+blotches per Planet Four tile; y axis: # of tiles (logarithmic).]
Figure 21: Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data. Bin size is 5, each bin is directly compared between data from experts (in dark blue) and catalog data (in orange), with the experts GP, MES, and KMA respectively, from top to bottom. Each histogram contains data for 432 tiles, with each expert classifying an independent data-set.
but first we examine the collective properties of the part of the Planet Four catalog that represents
the gold standard tiles. We compare and contrast these distributions to the expert classifications
together and per expert reviewer.
Figure 20 compares the number distribution of identified sources (i.e. fans + blotches) per Planet Four tile between the experts and the catalog data for the 192 tiles that were classified by all three science team members (KMA, GP, MES). Among the expert classifiers there are some visible differences, especially where the interpretation of one or two images dominates the value of a histogram bin. The final catalog lies within the variance of the individual expert assessments. This can be seen further in Figure 21, which shows the number distribution of identified objects (i.e. fans and blotches together) per Planet Four tile for the tiles that were classified by only one of the science team members. We note that even tiles with 30 or 40 fans and/or blotches are still well represented in the catalog.
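The binning behind such a comparison can be sketched as follows, using the bin size of 5 and the cut-off at 75 quoted for Figure 20. The function name is hypothetical and the per-tile counts are made-up values for illustration, not data from the catalog.

```python
from collections import Counter


def binned_counts(objects_per_tile, bin_size=5, max_value=75):
    """Count tiles per bin of object number, ignoring tiles at or above max_value."""
    counts = Counter()
    for n_objects in objects_per_tile:
        if n_objects >= max_value:
            continue  # omitted, like the sparsely filled bins above the cut-off
        counts[(n_objects // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Made-up numbers of fans+blotches per tile for one expert and for the catalog.
expert = [0, 3, 4, 7, 12, 12, 80]
catalog = [1, 2, 6, 6, 11, 13]

expert_hist = binned_counts(expert)
catalog_hist = binned_counts(catalog)
print(expert_hist)
print(catalog_hist)
```

Each dictionary key is the lower edge of a bin, so the two histograms can be compared bin by bin.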
5.2. Fan lengths and blotch areas
[Figure 22 plot. Title: Fan lengths, common expert data vs catalog. x axis: Fan lengths [pixel]; y axis: # of fans (logarithmic). Legend: GP, MES, KMA, catalog.]
Figure 22: Comparing measured fan lengths between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 30, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single entry bins above.
We also use our expert gold standard classifications to examine the physical sizes and areal coverage of the Planet Four catalog fans and blotches (see Figures 22 to 25). As in the previous comparisons, there is good agreement. The differences between the catalog and the experts are within the variance seen between the individual expert classifiers. Differences between the catalog and the experts become more apparent in small-number regimes (when fewer than 10 sources comprise a bin). These differences between the distributions are consistent with the small-number Poisson uncertainty on the histogram values [Kraft et al., 1991]. Thus, fan lengths and blotch areas are well reflected in the Planet Four catalog.
[Figure 23 plot. Title: Fan lengths, expert vs catalog. Three panels: GP vs catalog, MES vs catalog, KMA vs catalog. x axis: Fan lengths [pixel]; y axis: # of fans (logarithmic).]
Figure 23: Comparing measured fan lengths between experts and the catalog data. Bin size is 30, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single entry bins above.
[Figure 24 plot. Title: Blotch area, common expert data vs catalog. x axis: Blotch area [pixel**2]; y axis: # of blotches (logarithmic). Legend: GP, MES, KMA, catalog.]
Figure 24: Comparing measured blotch areas between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5000, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 80,000, omitting single entry bins above.
[Figure 25 plots: Blotch area, expert vs catalog, for GP, MES and KMA; x-axis: Blotch area [pixel**2]; y-axis: # of blotches]
Figure 25: Comparing measured blotch areas between experts and the catalog data. Bin size is 5000; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 120,000, omitting single-entry bins above.
5.3. Wind direction comparison
[Figure 26 plots. Left: Histogram of deltas between science team and volunteer mean fan directions; x-axis: Delta mean wind direction per Planet Four tile. Right: Histogram of angular STD for merged fan clusters; x-axis: Fan angle standard deviation per cluster [deg]; y-axis: Bin Counts]
Figure 26: Left: Of the 192 tiles that were analyzed by the science team, 82 resulted in fan catalog entries. Of those, we used the 39 that had more than 3 fans, for better statistics (the median number of fans per tile is 4, see Section 6). This histogram shows the difference between the science team and the volunteers in the mean fan angle of each of these 39 Planet Four tiles. Overall, the agreement is good, with a few rare outliers, discussed in the text and in Figures 27 and 28. Bin size is 2. Right: Standard deviations (STDs) of the directions of the fan markings that went into each cluster, before they are merged into the averaged catalog object. This plot shows the distribution of these STDs for the set of 192 common gold tiles, which contained a total of 904 fans. Bin size is 1.
Fig. 26, left, shows a histogram of the differences in the mean-over-tile fan directions between the catalog entries, which are clustered from all the volunteers’ markings, and the average from the three science team members. In general, the agreement is very good, with differences usually smaller than 10 degrees. Another way to investigate our uncertainties is to calculate the angular standard deviation of the member markings of each cluster that are merged into the final catalog objects, regardless of whether the markings were made by an expert or a volunteer. Fig. 27 discusses the lower outlier of Fig. 26, indicating that the respective Planet Four tile presents a more difficult than usual scenario, with a naturally higher variance of the actual deposit directions on the ground. Not only are the deposit shapes visible in the upper left more irregular than usual, there is also a visible gradient of directions across this tile, as can be seen from the exaggerated fan pointers. This gradient is probably caused by the basin shapes in the Inca City region, which can impose a topographical control on the alignment of fan deposits that takes precedence over the usual wind control. Our reduction pipeline still reliably reduces the markings for every deposit, but with higher than usual variance in the orientation and size of the markings. With no single clear fan direction in the image tile, it is reasonable to expect a higher variance and hence a higher delta when compared to the 3 science team members.
In a similar fashion, Fig. 28 discusses the high-side outlier of Fig. 26. While fans have been identified, their number is low, so that small deviations have a larger effect on the comparison with the catalog data. Additionally, the few fans that are visible appear to show different directions, leading to a less certain mean fan direction with a higher variance, which in turn can lead to larger differences between the expert and catalog values.
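The per-tile comparison described above can be sketched as follows. The text does not state how the mean direction per tile is computed; the circular mean used here is an assumption, chosen because it handles the wrap-around at 0°/360°. The function names and the input angles are hypothetical.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees; result lies in [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def signed_delta_deg(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

# Hypothetical fan directions (degrees) for one Planet Four tile
expert_angles = [350.0, 5.0, 10.0]
catalog_angles = [355.0, 8.0, 15.0]

delta = signed_delta_deg(circular_mean_deg(catalog_angles),
                         circular_mean_deg(expert_angles))
print(round(delta, 1))
```

Wrapping the difference keeps a tile with means of 355° and 5° at a delta of 10° rather than 350°.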
In Fig. 26, right, we plot the standard deviations for all 904 fan clusters in the 192 common tiles that were analyzed by all experts. The right end of this histogram is cut off by our angular clustering parameter of 20°, meaning that markings with larger angular differences are never clustered together. However, the majority of standard deviations lie far below that safety cut-off value for the clustering. We estimate an average uncertainty for our fan directions of about (5 ± 3)°, using the half-maximum width of this histogram. The actual uncertainty depends strongly on the quality of the data, as given by the HiRISE binning mode, and on the local variability of winds, which leads to increased diffusion of the deposits. We believe these factors lead to the non-Gaussian skew of the histogram.
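A minimal sketch of a per-cluster angular standard deviation is given below. It assumes that the spread is measured as the root-mean-square deviation of the member markings from their circular mean; the exact estimator used for the catalog is not specified in this section, and the function names and marking angles are hypothetical.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees; result lies in [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def angular_std_deg(angles_deg):
    """Population standard deviation of angles around their circular mean."""
    mean = circular_mean_deg(angles_deg)
    # Wrap each deviation to [-180, 180) so that 358 and 2 differ by 4 degrees
    devs = [(a - mean + 180.0) % 360.0 - 180.0 for a in angles_deg]
    return math.sqrt(sum(d * d for d in devs) / len(devs))

# Hypothetical marking directions (degrees) that were merged into one fan
cluster = [358.0, 2.0, 6.0, 354.0]
std = angular_std_deg(cluster)
print(round(std, 2))
```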
Additional validation results can be found in Appendix F.
5.4. Summary
In conclusion, our catalog has high completeness in most cases. Outliers were found to be caused by special circumstances with more challenging classification tasks, which create higher variance for all classifiers, including the experts. The analysis of the gold standard sample demonstrates that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images.
Figure 27: One of the outliers of Fig. 26, Planet Four tile ID APF00002aj of HiRISE image ESP_012744_0985.
The input image shows deposit shapes with less pronounced boundaries, leaking into the background. There is also
a visible gradient of directions across the tile (visible through the extended fan pointers). See the text for more
interpretation.
Figure 28: The highest outlier from Fig. 26, Planet Four tile ID APF0000c0t, from HiRISE image ESP_012858_0855. While fans have been identified, their number is small, increasing the chance for variance between the experts and catalog data.
6. Results: Fan and Blotch Catalog
From 221 HiRISE images from Mars years 29 and 30, cut up into 42,904 Planet Four tiles, the Planet Four volunteers produced almost 2.8 million fan markings, which were clustered into 159,558 fans in our MY29/MY30 catalog. In Table 4 we show an example of fan catalog data. For blotches, 3.46 million raw markings were combined into 250,164 blotches. 29.6 % of the image tiles (= 12,693) end up without any clustered markings in our catalog. Fig. 29 shows the distribution of the fraction of empty tiles per HiRISE image vs. solar longitude. Visual checks of data with fractions above 0.8 confirmed that these HiRISE images are mostly free of CO2 jet deposits in spring; in late summer, however, when the seasonal CO2 ice layer has fully sublimated, fan and blotch deposits are rendered mostly invisible, because they blend into the now ice-free background. A notable exception to this general effect is the ROI Inca City, where the summer data, after Ls = 250°–260°, regularly show fan deposits that are still discernible. This could point to an interesting difference in the compaction of the ground soil and its related observed texture. New deposits from CO2 jet eruptions may be sufficiently different in texture from the background as a result of particle sorting and related phase function changes of the fresher surface.
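The empty-tile fraction per HiRISE image can be computed along the following lines. This is a sketch under stated assumptions: the input structures, the function name and the tile identifiers t1 to t4 are hypothetical and do not reflect the actual catalog tooling.

```python
from collections import defaultdict

def empty_tile_fractions(tiles_by_image, catalog_entries):
    """Fraction of tiles without any catalog entry, per HiRISE image.

    tiles_by_image: dict mapping obsid -> set of all tile_ids cut from that image
    catalog_entries: iterable of (obsid, tile_id) pairs, one per catalog object
    """
    occupied = defaultdict(set)
    for obsid, tile_id in catalog_entries:
        occupied[obsid].add(tile_id)
    fractions = {}
    for obsid, tiles in tiles_by_image.items():
        if tiles:  # skip images without tiles to avoid division by zero
            n_empty = len(tiles - occupied[obsid])
            fractions[obsid] = n_empty / len(tiles)
    return fractions

# Hypothetical toy input: one image cut into four tiles, two fans in one tile
tiles_by_image = {"ESP_012079_0945": {"t1", "t2", "t3", "t4"}}
catalog_entries = [("ESP_012079_0945", "t1"), ("ESP_012079_0945", "t1")]
fractions = empty_tile_fractions(tiles_by_image, catalog_entries)

# Overall empty fraction quoted in the text: 12,693 of 42,904 tiles
overall_percent = 100.0 * 12693 / 42904
print(fractions, round(overall_percent, 1))
```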
[Figure 29 plot: Distribution of empty tiles vs time; x-axis: Solar Longitude [°]; y-axis: Fraction of empty tiles per HiRISE image]
Figure 29: Distribution of empty tiles over time, measured in Mars Solar Longitude. Until Ls = 260° the fraction of empty tiles per HiRISE image varies randomly, reflecting the different ground surfaces imaged across all latitudes. After Ls = 260° all CO2 is gone (earlier at lower latitudes), and most of the HiRISE images appear empty in terms of identifiable blotches or fans, because any deposits blend with the ice-free background.
   angle   distance  tile_id     image_x  image_y   marking_id  n_votes  obsid
0  205.56  179.71    APF0000ci9  2270.76  24336.16  F000000     35       ESP_012079_0945
1  185.39  179.62    APF0000cia  3391.21  5640.60   F000001     15       ESP_012079_0945
2  184.98  500.27    APF0000cia  3509.96  5876.70   F000002     10       ESP_012079_0945
3  184.29  105.43    APF0000cia  3716.27  5824.50   F000004     6        ESP_012079_0945
4  189.42  109.50    APF0000cia  3452.17  6033.00   F000005     3        ESP_012079_0945
5  194.16  335.78    APF0000cib  3565.47  15930.34  F000006     64       ESP_012079_0945
6  187.74  183.41    APF0000cib  3143.15  15433.60  F000007     20       ESP_012079_0945
7  209.47  179.29    APF0000cid  942.95   22257.99  F000008     58       ESP_012079_0945
8  199.91  220.64    APF0000cid  1199.11  21994.01  F000009     54       ESP_012079_0945
9  218.88  118.16    APF0000cid  815.95   22539.28  F00000a     42       ESP_012079_0945

   spread  version  vote_ratio  x       x_angle  y       y_angle
0  88.03   1        1.00        790.76  -0.90    224.16  -0.43
1  21.35   1        1.00        431.21  -1.00    160.60  -0.09
2  18.91   1        1.00        549.96  -1.00    396.70  -0.09
3  26.41   1        0.68        756.27  -1.00    344.50  -0.07
4  22.58   1        0.51        492.17  -0.99    553.00  -0.16
5  34.93   1        1.00        605.47  -0.97    586.34  -0.24
6  25.68   1        1.00        183.15  -0.99    89.60   -0.13
7  49.11   1        1.00        202.95  -0.87    337.99  -0.49
8  35.37   1        1.00        459.11  -0.94    74.01   -0.34
9  49.66   1        1.00        75.95   -0.77    619.28  -0.62

   l_s      north_azimuth  map_scale  BodyFixedCoordinateX  BodyFixedCoordinateY  BodyFixedCoordinateZ  PlanetocentricLatitude  PlanetographicLatitude  PositiveEast360Longitude
0  214.785  126.856883     0.25       -65.804336            261.407884            -3370.504345          -85.427383              -85.480830              104.129523
1  214.785  126.856883     0.25       -67.219114            257.011589            -3370.631413          -85.493546              -85.546226              104.656897
2  214.785  126.856883     0.25       -67.170611            257.055226            -3370.630794          -85.493039              -85.545725              104.644396
3  214.785  126.856883     0.25       -67.127761            257.024926            -3370.635002          -85.493723              -85.546401              104.637107
4  214.785  126.856883     0.25       -67.169940            257.096267            -3370.628302          -85.492368              -85.545061              104.642019
5  214.785  126.856883     0.25       -66.258570            259.361039            -3370.571273          -85.459101              -85.512180              104.330752
6  214.785  126.856883     0.25       -66.400170            259.284370            -3370.565666          -85.459755              -85.512827              104.364183
7  214.785  126.856883     0.25       -66.296391            261.048812            -3370.492211          -85.431209              -85.484612              104.249678
8  214.785  126.856883     0.25       -66.261274            260.965240            -3370.497183          -85.432730              -85.486115              104.246813
9  214.785  126.856883     0.25       -66.300167            261.124709            -3370.487589          -85.429945              -85.483362              104.246483
Table 4: First ten lines of the fan catalog file P4_catalog_v1.0_L1C_cut_0.5_fan_meta_merged.csv, broken into three segments.
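The first rows of Table 4 suggest how the direction columns relate to each other. The sketch below checks, for the tabulated values only, that angle agrees with atan2(y_angle, x_angle) to within the rounding of the printed values. That x_angle and y_angle are direction components of a cluster is an assumption made here, not something stated in the text; the helper name is hypothetical.

```python
import math

# (angle, x_angle, y_angle) copied from the first ten rows of Table 4
rows = [
    (205.56, -0.90, -0.43), (185.39, -1.00, -0.09), (184.98, -1.00, -0.09),
    (184.29, -1.00, -0.07), (189.42, -0.99, -0.16), (194.16, -0.97, -0.24),
    (187.74, -0.99, -0.13), (209.47, -0.87, -0.49), (199.91, -0.94, -0.34),
    (218.88, -0.77, -0.62),
]

def angle_from_components(x_angle, y_angle):
    """Direction in degrees, in [0, 360), of the vector (x_angle, y_angle)."""
    return math.degrees(math.atan2(y_angle, x_angle)) % 360.0

# Largest disagreement between the tabulated angle and the recomputed one
max_diff = max(abs(angle - angle_from_components(x, y)) for angle, x, y in rows)
print(round(max_diff, 2))
```

The residual differences stay well below one degree, which is compatible with the two-decimal rounding of the component columns.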
Figure 30: Planet Four tile APF00006mr from HiRISE image ESP_011296_0975 has the highest number of resulting
fan entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.
Figure 31: Planet Four tile APF00007t9 from HiRISE image ESP_012604_0965 has the highest number of resulting
blotch entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.
6.1. Catalog properties
6.1.1. Fan counts
The highest counts of fans and blotches were 167 fans in the tile_id APF00006mr and 278
blotches in the tile_id APF00007t9, shown in Figures 30 and 31. These data serve as an indication
of the dedication of the Planet Four volunteers producing results in such high spatial density. The
median count of fans and blotches per tile is 4. The distribution of both numbers is shown in
Fig. 32.
6.1.2. Fan lengths
As an example of the possibilities of the produced catalog, we describe the measured fan lengths in the catalog. The catalog column distance requires scaling by the values in map_scale to correct for the different HiRISE binning modes. The distribution of these measurements is shown in Fig. 33. About 97 % of all fans are below 100 m in length, with a median value of 24 m.
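The scaling step can be sketched as follows, assuming that distance is given in pixels, that map_scale is given in meters per pixel, and that the scaling is a simple multiplication. These are assumptions made for illustration; the function name is hypothetical, and the input pairs are the first five rows of Table 4.

```python
import statistics

def fan_lengths_m(rows):
    """Convert (distance [pixel], map_scale [m/pixel]) pairs to lengths in meters."""
    return [distance * map_scale for distance, map_scale in rows]

# (distance, map_scale) pairs taken from the first five rows of Table 4
rows = [(179.71, 0.25), (179.62, 0.25), (500.27, 0.25),
        (105.43, 0.25), (109.50, 0.25)]

lengths = fan_lengths_m(rows)
median_m = statistics.median(lengths)
frac_below_100 = sum(length < 100.0 for length in lengths) / len(lengths)
print(round(median_m, 1), frac_below_100)
```

Applied to the full catalog, the same two summary values would correspond to the median and the 100 m fraction quoted above.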
The three largest fans measured are all from the same ROI, called Manhattan Classic (Lat −86.39°, Lon 99°), with lengths of 373 m, 368 m and 361 m, respectively. They were identified in the HiRISE images ESP_013095_0935 (longest) and ESP_011961_0935 (second and third). The two longest fan markings even identify the same fan at different times in the season, with the longest observed at Ls = 265° and its shorter self at Ls = 209°. The two measurements differ by only 5 m; we attribute the larger later measurement both to material potentially being moved around by winds during spring and to a decrease of precision in identification after the CO2 has sublimed and the deposits start to fade into the background. Nevertheless, we interpret the fact that the largest fan was identified twice as a further indication of the high reliability of our results, considering that the random image serving procedure of the Planet Four classification interface ensured that volunteers did not classify images in the order in which they were taken, which would have increased the chance of being biased by their previous classifications. In this case, where 119 volunteers classified APF0000dtk with the longest fan and 54 volunteers classified APF0000de3 with the second longest fan (shown in Fig. 14), only one volunteer classified both tiles.
An overview of the fan length distributions for all major ROIs over both Martian years of data is shown in Fig. 34. Between Mars years 29 and 30, the total (over all ROIs) fan length statistics are very comparable, with a median of 24.2 m for MY29 and 23.8 m for MY30. However, we identify specific ROIs that have different fan properties between MY29 and MY30. For example, the ROI Manhattan has a median fan length of 42 m in MY29 and a decreased median of 25 m in MY30. This is in contrast with the ROI Giza, where the trend has the opposite direction: the median fan length of 44 m in MY29 is comparable with Manhattan’s median in the same year, but then increases to 59 m in MY30. Meanwhile, in ROI Ithaca, both years are very similar, with median fan lengths of 39.5 m and 38.6 m, respectively.
[Figure 32 plots: count distributions for marking = blotch and marking = fan; x-axis: Markings per tile]
Figure 32: Count distributions of catalog objects per tile, blotches on the left, fans on the right. The bin size is 5
counts in both plots.
[Figure 33 plots: Normalized Log-Histogram of fan lengths and Cumulative normalized histogram of fan lengths; x-axis: Fan length [m]; y-axis: Fraction of fans with given length]
Figure 33: Normalized histograms of all fan lengths in the catalog. Left: Log-Histogram, Right: Cumulative Histogram. The median (i.e. fraction of 0.5) value is at 24 m, with 82 % of the fans shorter than 50 m and 97 % shorter than 100 m, as indicated by the lines in the plot.
[Figure 34 plot: Fan lengths in different ROIs; x-axis: distance_m; y-axis: region (Maccelsfield, Starburst, Manhattan, Bilbao, Portsmouth, Ithaca, Manhattan_Frontinella, BuenosAires, Inca, Giza, Potsdam, Oswego_Edge)]
Figure 34: Boxplot showing the distributions of fan lengths for our set of regions of interest at the Martian south pole, over both Martian years of data, MY 29 and 30 (Inca City and Inca City Ridges are combined here due to their proximity). The boxplot uses the standard setup: the box spans the interquartile range (IQR), the whiskers extend to 1.5×IQR, and single dots mark outliers.
7. Wind Direction Results from Four Sample Regions of Interest (ROIs)
Early in the mission, the HiRISE team defined several regions of interest (ROIs) within the southern polar areas that have been extensively monitored for seasonal activity ever since (the list of original seasonal ROIs can be found in Hansen et al. [2010]). We have selected a subset of these ROIs to be analyzed by Planet Four, as shown in Table 1. The map of the ROIs’ distribution over the pole is shown in Fig. 4.
Below we focus on 4 example ROIs to showcase the use of the Planet Four data catalog and our ability to monitor wind directions using the locations and directions of fan markings. We have picked these 4 ROIs (informally named Ithaca, Giza, Manhattan, and Inca City) for regional case studies of the seasonal winds because the temporal coverage over these locations is the highest. We describe each ROI’s general setting and geomorphology based on HiRISE observations and our previous work [Hansen et al., 2010; Pommerol et al., 2011]. We then present the wind direction maps over the spring season at each of these locations. The wind rose diagrams for each individual HiRISE image are available in the supplementary file P4_catalog_v1.0_wind_rose_diagrams.pdf.
7.1. Ithaca
The Ithaca region is located at southern latitude 85.2°, eastern longitude 181.4°. This location is away from the permanent polar cap, at the edge of the cryptic region, and situated on a surface that is relatively smooth on a large scale: the digital terrain model produced by HiRISE (DTEPD_040189_0950_040216_0950_A01) shows vertical elevation variations of less than 60 m across the Ithaca region. At the same time, on the meter scale the surface in Ithaca is rough, showing irregular and uneven bumps and pits. No araneiforms (i.e. radially organized channels) were detected in Ithaca in HiRISE imaging, although rare isolated troughs and patterned ground similar to araneiform troughs are present [Hansen et al., 2010].
During local spring, fan-shaped deposits densely cover the Ithaca region (see an example in Fig. 35). Opening angles and lengths of the fans were reported to evolve during spring, although the nature of these changes was not quantified [Thomas et al., 2010]. Multiple fans were observed to emerge from common vents, at times merging together to create a single wider fan. The directions of the fan deposits were noted to be consistent from one Martian year to another, with only little variation.
An interesting detail about Ithaca is the very prominent bluish halos and fans that are repeatedly observed here [Thomas et al., 2010]. In contrast to the more common dark fan-shaped deposits, these halos and fans have higher albedos, approaching the albedo of fresh ice deposits. In Ithaca they are also distinctively bluer than the rest of the surface. There are at least two types of such bright deposits. One type resembles narrow fans that are located centrally over the older dark fans. These appear early in spring, before Ls = 190°. The other type resembles halos contouring the pre-existing dark fans. They appear on average later than the narrow bright fans.
In summer (Ls > 270°), the seasonal deposits are mostly invisible in Ithaca. This is partly because the small-scale roughness creates a patchy-looking environment, with pits being darker than bumps due to either shadows or dust collecting in depressions.
Fig. 35 shows a typical plot that we will use to analyze derived wind directions in our ROIs. This particular plot was created from Planet Four data for one HiRISE image (ESP_011931_0945) taken in Ithaca at Ls = 207.8°. To create this plot we took all the fan markings over the HiRISE image and plotted them as a histogram of their directions (top right panel of Fig. 35). Note that, in contrast to the standard wind rose diagrams showing the directions of the origin of winds, we use this diagram to show the measured deposition directions caused by the winds, i.e. the opposite of the wind origins. We chose this kind of display because it relates more closely to the actual measurements performed by the Planet Four project and does not imply any interpretation. The fan direction is counted clockwise (CW) from the North Azimuth (NA) direction, where 0° always represents North, and 270° West. The histogram is not scaled, i.e. the y-axis shows the actual counts of the fan markings with the direction of each bin on the x-axis. The maximum of the histogram is the most probable direction for the markings, and the width indicates how variable the directions of the markings are for this particular Ls. The default size of each histogram bin is 3.6°. In exceptionally rare cases the number of fan markings for a particular image, and thus the number of wind measurements, is low. Such cases require special treatment and an increase in bin size.
In the top left panel the same data are plotted in the wind rose diagram. This time the histogram is normalized to highlight the difference in directions if several HiRISE images are plotted in the same frame. Note that the position of zero (NA direction) depends on the location of the ROI, i.e. the wind rose diagram is map-projected to the location of the data plotted. Thus, the direction of the fans can be directly compared to the map-projected HiRISE image (bottom panel).
In this particular example one can see that the histogram has 2 peaks, which indicate that there are two distinct directions of the fans. This can be either (1) because of overlapping fan deposits from jets that erupted from the same vents at different times prior to Ls = 207.8° under different wind regimes; or (2) because different areas of the ROI have distinctively different wind regimes. In this example, comparing the derived fan directions to the sub-frame of the HiRISE image indicates that the first case is more probable.
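The binning described above can be sketched in a few lines. The 3.6° bin size and the relation between deposition direction and wind origin are taken from the text; the function names and the input angles are hypothetical, and the plotting itself is omitted.

```python
def direction_histogram(angles_deg, bin_size=3.6):
    """Unscaled counts of fan directions (degrees CW from NA) per bin."""
    n_bins = round(360.0 / bin_size)
    counts = [0] * n_bins
    for a in angles_deg:
        # The trailing modulo guards against a rounding artifact at 360 degrees
        counts[int((a % 360.0) / bin_size) % n_bins] += 1
    return counts

def wind_origin_deg(fan_direction_deg):
    """Direction the wind comes from, i.e. opposite to the deposition direction."""
    return (fan_direction_deg + 180.0) % 360.0

# Hypothetical fan directions for one HiRISE image
angles = [124.0, 125.0, 127.0, 300.0]
counts = direction_histogram(angles)
peak_bin = counts.index(max(counts))
peak_center = (peak_bin + 0.5) * 3.6  # center of the most populated bin
print(peak_bin, peak_center, wind_origin_deg(125.0))
```

With the default bin size the histogram has 100 bins; a larger bin_size can be passed for the rare images with few fan markings.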
Fig. 36 shows the directions of the fan deposits in Ithaca as retrieved by the Planet Four project for two Martian years: MY29 and MY30. We have separated the spring season into early spring, i.e. before Ls = 210°, and late spring, from Ls = 210° to Ls = 270°. The panels in this figure are organized so that the columns separate early and late spring, while the rows show MY29 and MY30.
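The panel assignment can be expressed as a small helper. The thresholds are those given in the text; treating every Ls below 210° as early spring (without a lower bound) and the function names are assumptions made for this sketch.

```python
def spring_phase(l_s):
    """Return 'early' for Ls < 210, 'late' for 210 <= Ls <= 270, else None."""
    if l_s < 210.0:
        return "early"
    if l_s <= 270.0:
        return "late"
    return None

def panel_key(mars_year, l_s):
    """Key of the figure panel (row = Mars year, column = spring phase)."""
    phase = spring_phase(l_s)
    return None if phase is None else (mars_year, phase)

print(panel_key(29, 207.8), panel_key(30, 214.785), panel_key(30, 280.0))
```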
Ithaca fans sustain the same direction towards ≈125° through the whole spring in both years, with only a small shift towards East (see also top left panel of Fig. 40). In MY29 the fan direction histograms are wider than in MY30. A narrow histogram is an indication of small deviations of the governing winds at the times of jet eruptions. The shift of the mean wind direction is less than 10° in the early spring, and the maximum shift is 25° over the whole season in MY29. Histograms widen with increasing Ls and sometimes develop double maxima, indicating more variability in the marked fan directions. This is also reflected in the increase of the standard deviation towards the end of spring. It can be attributed either to larger wind variability later in spring or to winds becoming strong enough to lift particles from the ground at times between jet eruptions. Overall, MY30 shows similar behavior to MY29.
Figure 35: Fan directions in Ithaca region at Ls = 207.8° (top) and a subframe of HiRISE image ESP_011931_0945 that can be directly compared to the wind rose diagram in the top left panel.
Figure 36: Direction of fan markings in Ithaca region for early and late spring of MY29 and MY30.
7.2. Giza
The Giza region is at southern latitude 84.8°, eastern longitude 65.7°. It is located closer to the edge of the permanent cap than Ithaca. It is also near a trough with exposures of southern polar layered deposits, while the area of Giza itself is flat on the km scale (see HiRISE DTM DTEPC_004736_0950_005119_0950_A01). On smaller scales, as can be seen in multiple HiRISE images taken over this area (including those that were input for Planet Four), the region is covered in modulated bumps and small ripples. One side of this ROI is covered in yardangs.
Very large and very intricate araneiform structures are located in this region. Their troughs are narrow and long, with high degrees of branching. These araneiforms are very active in spring: multiple long and narrow fans emerge from their troughs and cover an extended area. HiRISE detected a dusty reddish haze over the araneiforms in Giza in several years, indicating active loading of dust into the lower layer of the atmosphere. The directions of the fans in late spring were previously noted to co-align with the yardangs, suggesting that the wind regime in this area in summer stayed stable for an extended period of time [Hansen et al., 2010].
Similar to Ithaca, in Giza we do not observe significant differences in fan directions between MY29 and MY30 (Fig. 40 lower left panel and Fig. 37). Early images taken before Ls = 190° show very narrow histograms with a maximum between 300° and 310°. The maximum, which marks the direction of most fans, slowly shifts towards 360°. The shift rate is higher than in Ithaca (> 45° over the whole spring). The number statistics of fan detections worsen in late spring in both years, but this is particularly noticeable in late spring of MY30 (see histograms for the late spring of MY30). This is explained by the decreasing contrast between the fan deposits and the undisturbed surface around the fans in late spring images, i.e. the fans blending in with their environment.
Figure 37: Direction of fan markings in Giza region for early and late spring of MY29 and MY30. HiRISE images used: ESP_011447_0950, ESP_011448_0950, ESP_011777_0950, ESP_011843_0950, ESP_012212_0950, ESP_012265_0950, ESP_012344_0950, ESP_012704_0850, ESP_012753_0950, ESP_012836_0850, ESP_012845_0950, ESP_020150_0950, ESP_020401_0950, ESP_020480_0950, ESP_020783_0950, ESP_020902_0950, ESP_021482_0950, ESP_022273_0950.
7.3. Manhattan
The Manhattan region is in a very active area with at least 3 HiRISE ROIs that were once all considered under this same name. This area is around southern latitude 86°, eastern longitude 99°; like the two ROIs above, it lies on the edge of, but still inside, the cryptic region. The ROI is located on the eastern side of a South Polar Layered Deposit (SPLD) trough that in spring is completely covered with seasonal activity. The area is inclined towards the trough, i.e. in the north-west direction, although only slightly. According to the HiRISE DTM (DTEPC_022259_0935_022339_0935_A01), there is a 270 m elevation change over approximately 8 km (≈2° slope).
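The quoted slope follows directly from the two DTM numbers, as the short check below shows. It assumes a uniform incline over the full 8 km, which is a simplification; the variable names are chosen here for illustration.

```python
import math

elevation_change_m = 270.0      # elevation change from the HiRISE DTM
horizontal_distance_m = 8000.0  # approximate horizontal extent

# Mean slope angle of a uniform incline between the two end points
slope_deg = math.degrees(math.atan(elevation_change_m / horizontal_distance_m))
print(round(slope_deg, 2))
```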
Manhattan is covered in well-developed interlaced araneiforms. Similar to Giza, the araneiforms here have thin and long troughs and branch significantly. Aside from the araneiforms, the surface in Manhattan is smooth, even on scales of tens to hundreds of meters, with only a few exceptions of shallow irregular pits.
Seasonal activity is extensive in Manhattan, with dark fan deposits that at times develop bright halos. Intriguingly, the araneiforms’ troughs become visibly brighter compared to the rest of the surface around Ls = 200° and stay bright almost until the region completely defrosts.
Fans in Manhattan are directed 230° from the NA direction at the beginning of spring, as shown by the first observations of both analyzed years. This direction shifts during early spring and plateaus at 290° after Ls = 220°.
Figure 38: Direction of fan markings in Manhattan region for early and late spring of MY29 and MY30.
7.4. Inca City
Inca City is at southern latitude 81.3°, eastern longitude 295.7°; relative to the aforementioned ROIs it is on the opposite side of the permanent cap and the southern pole. The topography of this location is the most complex in our list (HiRISE DTM DTEPC_022699_0985_022607_0985_A01). It is a system of over 300 m-high ridges that crisscross each other at almost right angles, forming close-to-rectangular basins. The slopes of the ridges sometimes exceed 13°, providing a variety of insolation environments in a relatively small region. The inner surface of the basins is flat and most of the araneiforms of Inca City are carved into it. The formation of the Inca City ridge system is debated but most commonly attributed to the interaction of irregularities of the local crust with an impact-induced compaction wave [Kerber et al., 2017].
Araneiforms in Inca City are morphologically different from those in Giza and Manhattan. They have a well-developed central depression with relatively short troughs extending outwards and are on average smaller.
Seasonal activity in Inca City starts at the slopes of the ridges [Thomas et al., 2010]. Fan deposits extend downwards following gravity lines. The fans are very narrow but do not have any features of flows (dark flows come later in spring). It is not fully clear whether the fans in this ROI are directed by gravity or by downslope winds. The surface around and near the araneiforms, on the basin floor, gets covered mostly in blotches, suggesting that no significant winds are active inside the basins.
Directions of fans in Inca City are seemingly disordered, particularly in comparison to the 3 ROIs discussed above. However, Inca City is special in this set because it has prominent topography that the other 3 ROIs lack. Thus the analysis method that works well for our other ROIs might not be applicable to Inca City. The Inca City ridges affect the local deposition of solar energy and influence near-surface winds. Directions of fans in Ithaca, Giza, and Manhattan are modified by near-surface winds that normally pass undisturbed over the whole ROI. In contrast, in Inca City fans are observed almost exclusively on the slopes of the ridges and are aligned with the down-slope direction. However, these fans appear on the slopes gradually through spring: the first fans according to our analysis point in the south-west direction (270° from NA), i.e. they are located on south-west facing slopes. Early observations have the smallest standard deviation, indicating the smallest variation in the fan directions (Fig. 39). However, even in the early histograms several local maxima may be detected. The locations of the secondary maxima are determined by the slopes that were covered by the HiRISE image at each Ls. Later in spring the fans start to appear on slopes with orientations other than south-west. This widens the histogram for each HiRISE image and makes the location of the histogram maximum a less and less relevant measure of the mean fan direction. This results in a larger variation of the mean fan direction and large standard deviations (bottom right panel of Fig. 40). Local maxima repeatedly occur at the same directions from image to image in late spring, and the whole scenario repeats in both years with only small variations.
Figure 39: Direction of fan markings in Inca City region for early and late spring of MY29 and MY30.
Figure 40: Direction of fan markings in 4 ROIs vs Ls for MY29 and MY30. Directions are plotted in degrees relative to the NA direction. Error bars represent the standard deviation of the data and not the error on the mean. Prevailing winds control the direction of fans in Ithaca, Manhattan, and Giza because the overall topography in these ROIs is smooth and has no obstacles significantly modifying the winds. In Inca City, however, the topography is more prominent, with 3 km-high ridges that break down the general winds and support the creation of katabatic flows. Thus, the fans here follow the slopes of the ridges rather than the wind direction, which is reflected in the large scatter of the mean fan direction and the large standard deviations on the mean fan direction.
8. Conclusions
The Planet Four project has produced a catalog of 159,558 fans and 250,164 blotches (ellipses), identifying locations of seasonal surface deposits produced by the CO2 jet processes occurring during spring in the Martian south polar region. The catalog was generated by combining the assessments made by Planet Four volunteers reviewing a set of 42,904 tiles derived from 221 HiRISE observations obtained over 2 Martian Years, covering a set of 28 regions of interest (ROIs) across the south pole. To date, this catalog serves as the largest report of the locations, sizes, and mapping of seasonal deposits on the Martian surface. The Planet Four fan and blotch catalog constitutes a resource for studying polar winds, climate and polar processes. Using south polar fans as regional wind markers, the Planet Four catalog can provide tests for and input to global and regional atmospheric circulation models.
Statistical comparisons between classifications produced by the science team and catalog results for the same image data (Section 5) demonstrate that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images. Trend consistency for fan directions between Mars Years 29 and 30, despite the fact that most of the data were analyzed by different volunteers, further indicates the reliability of the methods presented here (see summary Figure 40). We have gone into considerable detail on the methodology behind the data in the catalog and are confident that its content can be productively used by our colleagues for their own research.
For 4 of the 28 ROIs we have presented mean fan directions. In three of these, the fan deposits
appear to be directly modified by near-surface winds at the time of jet eruption; the fourth ROI
shows the strong influence of topography. In the Ithaca, Giza, and Manhattan ROIs, the derived mean
winds show no significant inter-annual variability between MY29 and MY30: their directions at the
same Ls differ by less than 10°. In Inca City, the mean direction of the fans
coincides with the direction of slopes and changes over spring as more slopes become exposed
to sunlight and cold jet eruptions occur.
Our analysis in this paper focused on HiRISE observations from seasons 2 (MY29) and 3
(MY30) of the HiRISE southern seasonal processes campaign, and research into inter-annual variability is becoming feasible. However, the HiRISE campaign now covers 6 seasons of monitoring,
and for a number of selected ROIs 5 of these have been or are being analyzed by the Planet Four
project at the time of writing. The results from the analysis of these longer timespans and additional areal coverage will be topics of future publications and data releases.
Acknowledgements
The data presented in this paper are the result of the efforts of the Planet Four volunteers, generously donating their time to the Planet Four project, and without whom this work would not have
been possible. Their contributions are individually acknowledged at http://www.planetfour.org/authors.
Additionally we thank all those involved in BBC Stargazing Live 2013. This publication uses data generated via the Zooniverse.org platform, development of which was supported
by the Alfred P. Sloan Foundation. The authors also thank Chris Lintott (University of Oxford),
who had to decline authorship on this Paper. We thank him for his efforts contributing to the
development of the Planet Four website and for his useful discussions.
MES is currently supported by Gemini Observatory, which is operated by the Association of
Universities for Research in Astronomy, Inc., on behalf of the international Gemini partnership
of Argentina, Brazil, Canada, Chile, and the United States of America. MES was also supported
in part by an Academia Sinica Postdoctoral Fellowship and by a National Science Foundation
(NSF) Astronomy and Astrophysics Postdoctoral Fellowship under award AST-1003258.
CM was supported by the 2014 Institute of Astronomy and Astrophysics, Academia Sinica
(ASIAA) Summer Student Program. KMA and MES also thank the attendees of the Workshop
on Citizen Science in Astronomy for the insightful conversations and acknowledge ASIAA and
Taiwan’s Ministry of Science and Technology (MOST) for supporting the workshop. The authors
also thank Greg Hines, Cliff Johnson, Margaret Kosmala, Chris Schaller, Brooke Simmons, and
Ali Swanson for insightful discussions.
This work is also partially enabled by the National Aeronautics and Space Administration
(NASA) support for the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science
Experiment (HiRISE) team. This paper includes data collected by the MRO spacecraft and the
HiRISE camera, and we gratefully acknowledge the entire MRO mission and HiRISE teams’
efforts in obtaining and providing the images used in this analysis. The Mars Reconnaissance
Orbiter mission is operated at the Jet Propulsion Laboratory, California Institute of Technology,
under contracts with NASA. The authors also thank Rod Heyd for guidance in extracting the
geographic and location information for non-map-projected HiRISE images. This research has
made use of the USGS Integrated Software for Imagers and Spectrometers (ISIS) and of NASA’s
Astrophysics Data System.
KMA and GP were supported for this work by NASA ROSES Solar System Workings grant
NNX15AH36G.
All software created for the pipeline is based on the open source language Python, using the
matplotlib library [Hunter, 2007] for plotting, the pandas library for data wrangling and analysis
[McKinney, 2010], the scikit-learn library [Pedregosa et al., 2011] for the clustering of Planet Four
markings and other pre- and post-processing tasks, the IPython and Jupyter system for everyday
computing [Perez and Granger, 2007], and the SciPy tools [Jones et al., 2001].
Appendix A. The Zooniverse’s Ouroboros Web Platform
In this Section, we briefly describe the Zooniverse’s Ouroboros web platform and describe
how it interacts with the Planet Four classification interface. The Planet Four website and the
Ouroboros platform are both hosted on Amazon Web Services. This makes it possible to rapidly
scale up the number of servers based on the demand on the site, including handling the large
number of classifiers during Stargazing Live 2013. The Planet Four classification interface is
a JavaScript and CoffeeScript application that presents the classifier with the HiRISE tile and
enables the volunteer to draw markers on the image and submit them for storage in the Planet Four
classification database. The Zooniverse’s Ouroboros platform, written in Ruby on Rails, handles
the back end storage of classifications in a Mongo database and determines the next tile that should
be sent to a given Planet Four classifier for review.
Active tiles are shown to 30–100 classifiers before being retired from rotation. Once a classification is complete, the Planet Four interface sends the information via the Ouroboros Application
Programming Interface (API) to be stored in the database and to update the classification count
for the respective tile. If the activity on the website is low, this step is done immediately. If site
traffic is high, for example 70,000 people on the website at once (such as during launch of the
project), Ouroboros is designed to queue the classifications and store them asynchronously to the
database so as not to impair the speed and performance of the Planet Four website. In this case
the classification counts for the tiles and the list of tiles a registered or non-registered classifier has
seen are not updated in real time.
The Planet Four web interface queries the Ouroboros API to identify the next tile to present
to a classifier. Ouroboros checks the database and selects a random active tile that has not been
previously reviewed by the Zooniverse registered user or non-logged-in session. At any given
time, Ouroboros readies a list of 5 tiles that the classifier has not seen. When presented with a
request to see another image, the next in this list is sent back by the API. Typically this means
the classifier rarely if ever is presented with a tile to review twice. We note there was a bug in
Ouroboros at launch that made repeats more prevalent. In Appendix B we describe our methods
to cleanse duplicate and spurious classifications from our final data reduction.
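As an illustration only, the tile selection described above can be sketched in Python. Ouroboros itself is written in Ruby on Rails and its actual implementation is not reproduced here; the function name ready_tiles, the tile identifiers, and the fixed retirement limit of 30 are assumptions made for this sketch.

```python
import random

RETIRE_AT = 30   # assumed retirement limit; the text quotes 30-100 classifiers
BUFFER_SIZE = 5  # number of tiles readied per classifier, as stated in the text


def ready_tiles(counts, seen, rng, retire_at=RETIRE_AT, size=BUFFER_SIZE):
    """Return up to `size` random active tiles that the classifier has not seen.

    counts: dict mapping tile_id -> number of classifications received so far
    seen:   set of tile_ids already shown to this user or non-logged-in session
    rng:    a random.Random instance (passed in so results can be reproduced)
    """
    # A tile is a candidate if it is still active and new to this classifier.
    candidates = sorted(
        tile for tile, n in counts.items() if n < retire_at and tile not in seen
    )
    return rng.sample(candidates, min(size, len(candidates)))


# Hypothetical tile identifiers and classification counts.
counts = {"APF0000001": 3, "APF0000002": 30, "APF0000003": 12, "APF0000004": 0}
seen = {"APF0000003"}
queue = ready_tiles(counts, seen, random.Random(0))
```

In this toy case the retired tile (30 classifications) and the tile already seen are excluded, so the queue holds only the two remaining tiles.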
We note that in the Zooniverse’s Ouroboros framework, refreshing the Planet Four interface in
the browser will result in a new tile being selected and displayed without updating the classification database. Refreshing the browser is just as easy as hitting the ‘Finished’ and ‘Next’ buttons
to move on to a new image, so we do not believe this has any significant impact on classifier behavior. We mention it for completeness only. Also for the majority of the Season 2 and Season 3
classifications, a memory leak in the drawing library would cause the web browser to crash after
a rather large number of fans were drawn in the image (approximately over 30–50 sources). This
impacted a very small fraction of tiles.
Appendix B. Handling of Duplicate or Spurious Classifications
With the Zooniverse Ouroboros queuing system (described in Appendix A), it is possible that
a duplicate classification may occur, but these instances should be rare. A software bug in the
Ouroboros platform caused a number of classifiers to receive a cutout they had already
classified. Duplicate classifications are only a small portion of the data set, comprising
1.9 % of all classifications produced, and typically a few classifications or fewer per Planet
Four image tile were duplicates in those cases.
In order to treat each classification as an independent assessment, we removed all duplicate
classifications, keeping only the first response for a given registered user/non-logged-in session
for a given cutout.
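A minimal sketch of this de-duplication step, assuming the classifications are held in a pandas DataFrame with the column names listed in Appendix C. The function name and the toy rows are hypothetical, and this is not necessarily how the pipeline implements the step.

```python
import pandas as pd


def drop_duplicate_classifications(df):
    """Keep only the earliest classification of each tile by each user/session."""
    # One classification can span several rows (one per marking), so the
    # selection is made per classification_id, not per row.
    first = (
        df[["classification_id", "user_name", "tile_id", "created_at"]]
        .drop_duplicates("classification_id")
        .sort_values("created_at")
        .drop_duplicates(["user_name", "tile_id"], keep="first")
    )
    return df[df["classification_id"].isin(first["classification_id"])]


# Toy data: user u1 classified the same tile twice (c1, then c2).
rows = pd.DataFrame(
    {
        "classification_id": ["c1", "c1", "c2", "c3"],
        "user_name": ["u1", "u1", "u1", "u2"],
        "tile_id": ["APF0000p9t"] * 4,
        "created_at": pd.to_datetime(
            [
                "2013-01-08 23:25:43",
                "2013-01-08 23:25:43",
                "2013-01-09 10:00:00",
                "2013-01-09 11:00:00",
            ]
        ),
        "marking": ["fan", "blotch", "fan", "none"],
    }
)
cleaned = drop_duplicate_classifications(rows)
```

Here the later classification c2 is discarded, while both marking rows of c1 and the single row of c3 are retained.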
We also found a concentration of markings positioned at the top left corner (x=0, y=0) of the
marking interface, with nearly all having default values for the other recorded parameters. Only
0.12 % of the 9,631,517 markings recorded for Seasons 2 and 3 are affected. Further investigation
shows that less than 7 % of fan and blotch markings with default parameters with x=0 or y=0 are
not centered at the origin. Thus, we believe these origin default-valued markings are due to a
JavaScript error. Therefore, we simply delete them from the database, but keep any other markings
associated with those affected classifications. Additionally 33 markings (∼0.003 % of all entries
in the Planet Four classification database) do not have all of the required parameters that should
have been recorded. We believe this is due to a singularity in the drawing tool for that marker,
and we remove those entries from the database. There are also positions in the database recorded for
a handful of fans and blotches significantly out of the bounds of the user interface. A classifier
can move a marker drawn outside the edge of the image, to better capture the center position of a
feature, but these positions are well outside the image region. This represents well less than 1 %
of all classifications, and we have removed them from the analysis presented here. All statistics
and values reported in this Paper are after the filtering described above.
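The filters above could be approximated along the following lines. This is a simplified sketch: it does not check the default values of the other parameters, the 840 x 648 pixel tile size is inferred from the tile center coordinate (420, 324) quoted in Appendix E.3, and the 100 pixel margin is a hypothetical tolerance, since no numerical criterion for "well outside the image region" is given.

```python
import pandas as pd

TILE_WIDTH, TILE_HEIGHT = 840, 648  # inferred from the tile center (420, 324)
MARGIN = 100  # hypothetical tolerance in pixels; not a value from the paper


def drop_spurious_markings(df, margin=MARGIN):
    """Remove fan/blotch rows at the origin or far outside the tile."""
    is_marking = df["marking"].isin(["fan", "blotch"])
    at_origin = is_marking & (df["x"] == 0) & (df["y"] == 0)
    # Series.between is False for NaN, so fan/blotch rows with missing
    # coordinates are also flagged here.
    inside = df["x"].between(-margin, TILE_WIDTH + margin) & df["y"].between(
        -margin, TILE_HEIGHT + margin
    )
    out_of_bounds = is_marking & ~inside
    return df[~(at_origin | out_of_bounds)]


# Toy rows: one origin marking, one valid blotch, one far-off fan, one "none".
markings = pd.DataFrame(
    {
        "marking": ["fan", "blotch", "fan", "none"],
        "x": [0.0, 553.65, 5000.0, float("nan")],
        "y": [0.0, 355.817, 10.0, float("nan")],
    }
)
kept = drop_spurious_markings(markings)
```

Rows with marking "none" carry no coordinates and pass through unchanged, which keeps the record that a volunteer reviewed the tile without marking anything.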
Appendix C. Raw Classification Data
Here we provide additional details about the raw classification data provided in the online
supplementary data file8 . It is written in the binary HDF5 format, in the variant produced by the
pandas library (supported by the PyTables library9 ).
The general structure is as follows: Each classification submission by an individual volunteer
creates a classification_id. All objects created by this volunteer receive the same classification_id,
with the marking data for each object being one entry in the classification database. Each data row
also has a marking column that identifies if this data is for a fan, a blotch, an interesting feature
that will have the string value “interesting” in the marking column, or “none”, when the volunteer
did not create any marking object. Below we describe the columns available in this database:
Column name | Example value | Description
classification_id | 50ecaaf760d4050d21000414 | Unique ID for each classification by a Planet Four volunteer
created_at | 2013-01-08 23:25:43 | Time of submission
tile_id | APF0000p9t | Planet Four tile identifier
image_name | ESP_021491_0950 | HiRISE observation identifier
tile_url | http://www.planetfour.org/subjects/standard/50e741555e2ed211dc002346.jpg | URL to image data for this Planet Four tile
user_name | abc | Originally, the Zooniverse username or non-logged-in session ID. For privacy concerns, we have converted these to anonymous IDs.
marking | blotch | Identifier for what the data in the row is for: blotch, fan, interesting, none
x_tile | 1 | x coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases to the right.
y_tile | 2 | y coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases downwards.
acquisition_date | 2011-01-01 00:00:00 | Date only for HiRISE observation time (ignore hours)
local_mars_time | 5:43 PM | Local Mars time for given acquisition date
x | 553.65 | x pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases to the right.
y | 355.817 | y pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases downwards.
image_x | 2033.65 | x pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases to the right.
image_y | 37071.8 | y pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases downwards.
radius_1 | 295.195 | Semi-major axis of blotch object in pixels. NaN if not applicable (N/A)
radius_2 | 294.715 | Semi-minor axis of blotch object in pixels. NaN if N/A
distance | NaN | Length of fan object in pixels. NaN if data row is for blotch or interesting
angle | 27.4331 | Orientation of marking object with respect to tile image x-axis in degrees. Positive clockwise, zero to image right (same definition as HiRISE)
spread | NaN | Opening angle of fan objects in degrees. NaN if N/A
version | NaN | Version of tool used to create fan. NaN if N/A
x_angle | 0.887549 | Cartesian x coordinate of angle column on unit circle
y_angle | 0.460713 | Cartesian y coordinate of angle column on unit circle

8 2018-02-11_planet_four_classifications_queryable_cleaned_seasons2and3.h5
9 http://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables

The Planet Four classification interface recorded a different angle than the intended spread
angle from the fan marking tool. This was identified and subsequently fixed in the software.
The correct spread angle is recoverable from the values stored in the database. We denote those
markings generated before the patch with version flag set to 1.0 and those after with the version
flag set to 2.0. We provide the corrected spread angle for the fans affected, but leave that version
flag in the final catalog, for reference. To gather statistics on the understanding of the tutorial,
the Planet Four classification database contains all the tutorial markings, indicated by a HiRISE
image name of ‘tutorial’. For the delivered raw classification database, the range of the fan angles has
been converted from -180–180 to 0–360, while the range of the blotch angles has been converted
to 0–180, due to their rotational symmetry.
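The relation between the angle column and the x_angle and y_angle columns, and one possible way to perform the range conversions just described, can be sketched as follows. The function names are hypothetical, and the use of a modulo operation is an assumption; the paper does not state how the conversion was implemented.

```python
import numpy as np


def angle_to_unit_circle(angle_deg):
    """Cartesian coordinates of an angle (in degrees) on the unit circle."""
    rad = np.deg2rad(angle_deg)
    return np.cos(rad), np.sin(rad)


def fan_angle_to_0_360(angle_deg):
    """Map a fan angle from the -180..180 range into 0..360."""
    return np.mod(angle_deg, 360.0)


def blotch_angle_to_0_180(angle_deg):
    """Map a blotch angle into 0..180, using the 180 degree symmetry of an ellipse."""
    return np.mod(angle_deg, 180.0)


# Example angle taken from the column table in this appendix.
x_angle, y_angle = angle_to_unit_circle(27.4331)
```

For the example angle of 27.4331 degrees this gives values close to the tabulated x_angle of 0.887549 and y_angle of 0.460713. Storing the angle as a unit vector is convenient for averaging directions, because component-wise means do not suffer from the wrap-around at 0/360 degrees.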
Appendix D. Pipeline outputs
The intermediate stages of our clustering and combination pipeline
are identified with the level identifiers 1A, 1B, and 1C, indicating different stages of the
processing pipeline, where the processing is done on a per-tile-id level. After this is done, the final
step combines all the data from the tens of thousands of tile_id folders into a set of summarizing
CSV files.
Appendix D.1. Directory file structure
The directory file structure of the pipeline products is as follows (examples in parentheses):
• HiRISE observation ID (ESP_011350_0945)
  – Planet Four tile ID (APF0000any)
    * Level 1A (L1A/APF0000any_L1A_fans.csv)
    * Level 1B (L1B/APF0000any_L1B_fnotches.csv)
    * Level 1C with cut value 0.5 in directory name (L1C_cut_0.5/APF0000any_L1C_cut_0.5_blotches.csv)
with the list of HiRISE observation IDs identifying the HiRISE observations that went into
Planet Four for this database.
Appendix D.2. Pipeline stage levels
Appendix D.2.1. Level 1A
Level 1A is the data that is directly output from clustering and averaging the cluster members
into average markings, as described in Section 4.2. Here, the biggest reduction in terms of numbers
of objects in the system occurs, as the data of all the different volunteers are combined into one
object when the clustering process has determined the markings to be part of one cluster. All
newly created average fans and blotches are summarized into one fan and one blotch summary file,
respectively, with each line representing the mean object from averaging all cluster members.
As an example, the content of APF0000p3q_L1A_fans.csv is shown below. When the column
names match those given in Appendix C, they have the same meaning. The two new
columns are n_votes, which records how many members the cluster had that was used to produce
this averaged object, and marking_id, which is created at this stage of the pipeline and
serves as a tracer throughout the different pipeline outputs:

   x_tile  y_tile           x           y     image_x       image_y  radius_1  radius_2
0     2.0    26.0  123.611111  455.666667  863.611111  14155.666667       NaN       NaN
1     2.0    26.0  157.000000  391.800000  897.000000  14091.800000       NaN       NaN

    distance       angle     spread  version   x_angle   y_angle  n_votes    image_id
0  81.884266  223.712817  71.559689      1.0 -0.691035 -0.660663        9  APF0000any
1  57.742472  248.754137  52.521798      1.0 -0.360802 -0.927999       10  APF0000any

        image_name marking_id
0  ESP_011350_0945    F006de3
1  ESP_011350_0945    F006de4

Additionally, each L1A folder contains a text file called clustering_setttings.yaml that
summarizes the clustering settings used for these data for reference. The epsilon values are static and
all the same, but the min_samples value is calculated dynamically (see Section 4.2.1 for details).
Appendix D.2.2. Level 1B
At level 1B, the combination pipeline has determined which objects are so close to each other
that they should be considered for merging (see Section 4.3). The outputs are between one and
three files this time. There is only one file if all fans and blotches found were so close that they need
to be evaluated by their classification votes. Usually, though, there are two to three files, where
one file stores the objects that need voting, and the other file(s) store the objects that don’t have
any close neighbors and will simply be copied over to the final level later. The fans and blotches
in these latter files will receive the ‘vote_ratio’ value of 1.0, indicating that they had a “perfect”
probability for being a fan, or blotch, respectively. The third file that keeps the close objects for
the later thresholding contains these temporary meta-objects in sets of 2 rows, one fan and one
blotch, and has the term “fnotch” in its filename (fnotches: FaN–blOTCH). This file contains all
the clustering statistics data from L1A required to make a cut decision for L1C, with the data for
each meta-object being sorted in alternating rows. Here are the first four rows of the fnotch file
APF0000any_L1B_fnotches.csv:

             angle   distance    image_id       image_name     image_x       image_y
fan     223.712817  81.884266  APF0000any  ESP_011350_0945  863.611111  14155.666667
blotch   67.261720        NaN  APF0000any  ESP_011350_0945  838.395834  14123.875000
fan     247.146845  58.742330  APF0000any  ESP_011350_0945  832.000000  14306.400000
blotch   70.684606        NaN  APF0000any  ESP_011350_0945  821.666667  14281.428571

       marking_id  n_votes   radius_1   radius_2     spread  version
fan       F006de3        9        NaN        NaN  71.559689      1.0
blotch    B0071f2        8  49.309277  36.981958        NaN      NaN
fan       F006de5        5        NaN        NaN  81.171448      1.0
blotch    B0071ed        7  35.324591  26.493443        NaN      NaN

                 x   x_angle  x_tile           y   y_angle  y_tile  vote_ratio
fan     123.611111 -0.691035     2.0  455.666667 -0.660663    26.0    0.539412
blotch   98.395834  0.379131     2.0  423.875000  0.907431    26.0    0.460588
fan      92.000000 -0.387419     2.0  606.400000 -0.919245    26.0    0.426667
blotch   81.666667  0.217508     2.0  581.428571  0.852341    26.0    0.573333

This data stage L1B is what can be used to create a different significance threshold cut for the
final data, by filtering on the data column vote_ratio in the fnotch file for the required threshold
value. For example, if a higher threshold on the probability for a fan is wanted, e.g. 0.8, one would
filter out all rows that start with “fan” with a vote_ratio value below 0.8. One then needs to decide
if one wants to use this threshold as a general “certainty” filter and simply not take any object
with a vote_ratio < 0.8, or if one wants the blotch to appear instead of a fan.
Appendix D.2.3. Level 1C
This level contains the data of the final catalog files, but split up by Planet Four tile.
At the end of the thresholding stage (Section 4.3), appending the data for the rows that pass the
threshold filters into the respective blotch and fan files and copying these completed files into the
L1C directory completes that thresholding step and fills up the L1C folders. A final tool walks
through each folder and collects all the fan and blotch data into one summary file each, followed
by merge operations with meta-data that is useful for future analysis. These files are described in
the next section, Appendix E.
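The re-thresholding of Level 1B data described in Appendix D.2.2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name apply_cut, the strict flag, and the column name kind (standing in for the leading fan/blotch row label of the fnotch file) are hypothetical, and the toy rows reuse the marking_id and vote_ratio values of the four example rows of APF0000any_L1B_fnotches.csv.

```python
import pandas as pd

# Alternating fan/blotch rows, one pair per meta-object, as in the fnotch file.
fnotches = pd.DataFrame(
    {
        "kind": ["fan", "blotch", "fan", "blotch"],  # hypothetical column name
        "marking_id": ["F006de3", "B0071f2", "F006de5", "B0071ed"],
        "vote_ratio": [0.539412, 0.460588, 0.426667, 0.573333],
    }
)


def apply_cut(fnotches, cut, strict=False):
    """Select markings from fan/blotch pairs for a given vote_ratio cut.

    strict=False: keep the fan if its vote_ratio >= cut, otherwise the blotch.
    strict=True:  keep a marking only if its own vote_ratio >= cut, so a pair
                  can be dropped entirely (the general "certainty" filter).
    """
    fans = fnotches.iloc[0::2].reset_index(drop=True)
    blotches = fnotches.iloc[1::2].reset_index(drop=True)
    fan_wins = fans["vote_ratio"] >= cut
    blotch_wins = (blotches["vote_ratio"] >= cut) if strict else ~fan_wins
    return pd.concat([fans[fan_wins], blotches[blotch_wins]], ignore_index=True)


default_cut = apply_cut(fnotches, 0.5)
higher_cut = apply_cut(fnotches, 0.8)
certain_only = apply_cut(fnotches, 0.8, strict=True)
```

With a cut of 0.5 the first pair is kept as a fan and the second as a blotch; raising the cut to 0.8 turns both pairs into blotches, while the strict variant keeps neither object of either pair.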
Appendix E. Planet Four Catalog files description
Our catalog product files consist of one CSV result file each for the fan and blotch markings, a Planet
Four tile meta-data file, and a HiRISE observation meta-data file. Below, each subsection describes
the data columns for these files.
For convenience we provide both the planeto-centric and planeto-graphic latitudes for each
fan’s base and blotch’s center point. Longitudes are measured 0–360, increasing positive to the
East. Note that, because the HiRISE images were not co-registered, the conversion of pixel to
geographical coordinates can be offset by up to 100 HiRISE pixels between data from different
HiRISE images.
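As a consistency check on these coordinate columns, the planetocentric latitude and the east-positive longitude can be recomputed from the body-fixed Cartesian coordinates. The sketch below assumes the usual convention of a Z axis along the rotation axis and an X axis through the prime meridian; the function name is hypothetical, and the input values are the BodyFixedCoordinate example values listed in the catalog tables of this appendix.

```python
import math


def bodyfixed_to_latlon(x_km, y_km, z_km):
    """Planetocentric latitude and east-positive 0-360 longitude in degrees."""
    lat = math.degrees(math.atan2(z_km, math.hypot(x_km, y_km)))
    lon = math.degrees(math.atan2(y_km, x_km)) % 360.0
    return lat, lon


# Example body-fixed coordinates [km] from the catalog tables.
lat, lon = bodyfixed_to_latlon(-67.2071, 257.05, -3370.63)
```

The result agrees with the tabulated example values of -85.493 (planetocentric latitude) and 104.652 (longitude) to within rounding. The planetographic latitude additionally depends on the flattening of the reference ellipsoid and is not reproduced here.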
Appendix E.1. Fan catalog
Column name | Example value | Description
marking_id | F00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch
angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise
distance | 179.6 | Length of fan in pixels
tile_id | APF0000cia | Tile identifier in the Planet Four system
image_x | 3391.2 | Base X coordinate [px] in original HiRISE image
image_y | 5640.6 | Base Y coordinate [px] in original HiRISE image
n_votes | 15 | Number of markings that went into this average object
obsid | ESP_012079_0945 | HiRISE image observation id
spread | 21.346 | Spreading angle of fans
version | 1 | Version number of fan model used in Planet Four (see Appendix C)
vote_ratio | 1.0 | Ratio of votes from a potential combination step. Value of 1.0 means only fan votes occurred.
x | 431.206 | Base X pixel coordinate in the Planet Four tile
y | 160.6 | Base Y pixel coordinate in the Planet Four tile
x_angle | -0.995088 | Polar X coordinate of alignment angle
y_angle | -0.0938355 | Polar Y coordinate of alignment angle
l_s | 214.785 | Solar longitude of HiRISE observation
map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode
north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image
BodyFixedCoordinateX | -67.2071 | Base X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Base Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Base Z coord. [km] in Mars-fixed ref. frame
PlanetoCentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetoGraphicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object

Appendix E.2. Blotch catalog
Column name | Example value | Description
marking_id | B00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch
angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise
tile_id | APF0000cia | Tile identifier in the Planet Four system
image_x | 3391.2 | Center X pixel coordinate in the original HiRISE image
image_y | 5640.6 | Center Y pixel coordinate in the original HiRISE image
n_votes | 15 | Number of markings used for the average object
obsid | ESP_012079_0945 | HiRISE image observation id
radius_1 | 10.4 | Semi-major axis of blotch
radius_2 | 15.2 | Semi-minor axis of blotch
vote_ratio | 0.0 | Ratio of votes from a potential combination step. Value of 0.0 means only blotch votes occurred.
x | 431.206 | Center X pixel coordinate in the Planet Four tile
y | 160.6 | Center Y pixel coordinate in the Planet Four tile
x_angle | -0.995088 | Polar X coordinate of alignment angle
y_angle | -0.0938355 | Polar Y coordinate of alignment angle
l_s | 214.785 | Solar longitude of HiRISE observation
map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode
north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image
BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame
PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object (Positive East 360)

Appendix E.3. Planet Four tile catalog
Here we provide the data required to position the Planet Four tiles either back into HiRISE images, if so required, or directly onto the Martian surface, by using the provided latitude/longitude
values or their map-value equivalents in the BodyFixed-Mars frame in a rectangular coordinate
system, measuring kilometers from the south pole. The coordinate values come directly from the
ISIS campt utility, while the x_tile and y_tile position indices of tiles inside the HiRISE image are
the result of the splitting-up routine that was developed by the Zooniverse team at the beginning
of the project. All coordinates were calculated at the tile center pixel coordinate of (420, 324).
The decimal digits precision was set to 7, guided by the Latitude/Longitude significant bits for a
HiRISE pixel diameter on the ground for a 1x1 binning observation.
Column name | Example value | Description
BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame
PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object (Positive East 360)
tile_id | APF0000cia | Tile identifier in the Planet Four system
obsid | PSP_003092_0985 | HiRISE observation ID of the source image for this tile
x_hirise | 840 | X pixel coordinate of the tile center in the HiRISE image
x_tile | 5 | X index of the Planet Four tile inside the HiRISE image (1-based)
y_hirise | 648 | Y pixel coordinate of the tile center in the HiRISE image
y_tile | 11 | Y index of the Planet Four tile inside the HiRISE image (1-based)

Appendix E.4. HiRISE observations catalog
This catalog provides the user with a list of HiRISE images and their meta-data that were used
to create the Planet Four results presented here. The columns with capital letters were directly
taken from the published cumulative EDR index10. The decimal digits precision was set to 7,
guided by the Latitude/Longitude significant bits for a HiRISE pixel diameter on the ground for a
1x1 binning observation.
Column name | Example value | Description
OBSERVATION_ID | ESP_011296_0975 | HiRISE observation identifier
IMAGE_CENTER_LATITUDE | -82.1965000 | Planetographic latitude of the HiRISE image center
IMAGE_CENTER_LONGITUDE | 225.2530000 | Longitude of HiRISE image center (positive west 360)
SOLAR_LONGITUDE | 178.8330000 | Solar longitude of HiRISE image. Equivalent to column l_s in the fan and blotch catalogs.
START_TIME | 2008-12-23 16:15:26 | UTC time of observation start
map_scale | 1.0000000 | Units: m/pixel. Calculated from EDRCUMINDEX by 0.25*BINNING
north_azimuth | 110.6001067 | The median north azimuth value for the HiRISE image, recalculated with ISIS’ campt, due to known errors in HiRISE EDR index file.
# of tiles | 91 | The number of created Planet Four tiles per HiRISE observation. Depends on original image size.

10 https://hirise-pds.lpl.arizona.edu/PDS/INDEX/EDRCUMINDEX.TAB
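As a small worked example of how the map_scale column might be used, the sketch below converts a fan length from pixels to meters. It assumes that map_scale expresses meters per pixel, which is what the relation 0.25*BINNING suggests; the function names are hypothetical, and the input values are the example values of the fan catalog (distance 179.6, map_scale 0.25).

```python
def map_scale_from_binning(binning):
    """Map scale for a given HiRISE binning mode, following 0.25*BINNING."""
    return 0.25 * binning


def fan_length_m(distance_px, map_scale):
    """Fan length in meters, assuming map_scale is given in meters per pixel."""
    return distance_px * map_scale


# Example values from the fan catalog table: 179.6 px at a map_scale of 0.25.
length = fan_length_m(179.6, map_scale_from_binning(1))
```

Under this assumption the example fan is about 44.9 m long. Applying the factor before comparing fan lengths keeps measurements from observations taken in different binning modes on a common scale.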
Appendix F. Extended validation results
In addition to the combined fan and blotch count we explored in Section 5, we further explore
here how well the Planet Four catalog identifies fans (those dark sources with a clear direction
and starting point) versus blotches, separately. We separate the catalog and gold standard classifications by marker type in Figures F.41 to F.44. The data processing pipeline plays a significant
role in the completeness of the catalog. At the Thresholding stage, our data processing algorithm
determines which clusters will ultimately become fans with a value of P(fan) > 0.5. Like for the
total number of sources, the number distribution of fans and the number distribution of blotches
matches the expert assessments and is within the 3-σ uncertainty [Kraft et al., 1991]. Thus, in
most cases where the science team member marked a fan, the catalog also identifies this source as
fan. Based on these results, we have high confidence in our fan and blotch identifications within
the Planet Four catalog.
[Figure F.41 plot. Title: Common Expert data vs Catalog: Fans only. x-axis: # of fans per Planet Four tile; y-axis: # of tiles; legend: GP, MES, KMA, catalog.]
Figure F.41: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data; here,
for the 192 tile_ids that were classified by all experts. Bin size is 5, each bin is directly compared between the data
from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at
60, omitting single entry bins above.
[Figure F.42 plot. Title: Expert vs Catalog object identification frequency: Fans only. Three panels: GP vs catalog, MES vs catalog, KMA vs catalog. x-axis: # of fans per Planet Four tile; y-axis: # of tiles.]
Figure F.42: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data. Bin
size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and
the catalog results (brown). Binning max was cut off at 85, omitting single entry bins above.
[Figure F.43 plot. Title: Common Expert data vs Catalog: Blotches only. x-axis: # of blotches per Planet Four tile; y-axis: # of tiles; legend: GP, MES, KMA, catalog.]
Figure F.43: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data;
here, for the 192 tile_ids that were classified by all experts. Bin size is 5, each bin is directly compared between the
data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off
at 60, omitting single entry bins above.
[Figure F.44 plot. Title: Expert vs Catalog object identification frequency: Blotches only. Three panels: GP vs catalog, MES vs catalog, KMA vs catalog. x-axis: # of blotches per Planet Four tile; y-axis: # of tiles.]
Figure F.44: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data.
Bin size is 5, each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey)
and the catalog results (brown). Binning max was cut off at 85, omitting single entry bins above.
Appendix F.1. Example tile comparisons
In Figures F.45 and F.46 we show an example comparison of volunteers’ markings with those
performed by the science team. The aforementioned slight deviations of the science team members
from each other are visible; however, it is clear that the catalog wind directions in Fig. F.45 are well
reproduced by both the specialists and the volunteers. The results for blotches in Fig. F.46 are very
comparable, with the added simplification that blotches have a much reduced directivity compared
to fans.
Figure F.45: Comparing volunteers’ markings and the resulting clustering with the markings performed by science
team members for Planet Four tile ID APF0000hqn of HiRISE image ESP_012316_0925. The extended fan center
lines are 3 times exaggerated fan lengths to indicate the general trend of fan directions for easy visual comparison.
The derived wind directions compare very well between the catalog and the science team data.
Figure F.46: Comparing the volunteers’ markings and the resulting clustering with the markings performed by science
team members for Planet Four tile APF000018t of HiRISE image ESP_012889_0985. The blotches compare very
well between the science team and the volunteers, with slight disagreements among the science team members.
References
A
rti
cl
ei
n
Pr
es
Alger, M.J., Banfield, J.K., Ong, C.S., Rudnick, L., Wong, O.I., Wolf, C., Andernach, H., Norris, R.P., Shabala, S.S.,
2018. Radio galaxy zoo: machine learning for radio source host galaxy cross-identification. Mon. Not. R. Astron.
Soc. 478, 5547–5563.
Anderson, J.A., Sides, S.C., Soltesz, D.L., Sucharski, T.L., Becker, K.J., 2004. Modernization of the Integrated
Software for Imagers and Spectrometers, in: Mackwell, S., Stansbery, E. (Eds.), Lunar and Planetary Science
Conference, p. 2039.
Aye, K.M., Portyankina, G., Thomas, N., 2010. Semi-Automatic Measures of Activity in the Inca City Region of Mars
Using Morphological Image Analysis, p. 2707. URL: http://adsabs.harvard.edu/cgi-bin/nph-data_
query?bibcode=2010LPI....41.2707A&link_type=ABSTRACT.
Banerji, M., Lahav, O., Lintott, C.J., Abdalla, F.B., Schawinski, K., Bamford, S.P., Andreescu, D., Murray, P., Jordan Raddick, M., Slosar, A., Szalay, A., Thomas, D., Vandenberg, J., 2010. Galaxy zoo: reproducing galaxy
morphologies via machine learning. Mon. Not. R. Astron. Soc. 406, 342–353.
Becker, K.J., Anderson, J.A., Sides, S.C., Miller, E.A., Eliason, E.M., Keszthelyi, L.P., 2007. Processing HiRISE
Images Using ISIS3, in: Lunar and Planetary Science Conference, p. 1779.
Bird, R., Daniel, M.K., Dickinson, H., Feng, Q., Fortson, L., Furniss, A., Jarvis, J., Mukherjee, R., Ong, R., Sadeh, I.,
Williams, D., 2018. Muon hunter: a zooniverse project arXiv:1802.08907.
Bowley, C., Mattingly, M., Barnas, A., Ellis-Felege, S., Desell, T., 2018. Detecting wildlife in unmanned aerial
systems imagery using convolutional neural networks trained with an automated feedback loop, in: Computational
Science – ICCS 2018, Springer International Publishing. pp. 69–82.
Bugiolacchi, R., Bamford, S., Tar, P., Thacker, N., Crawford, I.A., Joy, K.H., Grindrod, P.M., Lintott, C., 2016.
The Moon Zoo citizen science project: Preliminary results for the Apollo 17 landing site. Icarus 271, 30–48.
doi:10.1016/j.icarus.2016.01.021.
Clancy, R., Sandor, B., Wolff, M., 2000. An intercomparison of ground-based millimeter, MGS TES, and Viking
atmospheric temperature measurements- Seasonal and interannual variability of temperatures . . . . Journal of geophysical . . . URL: http://www.agu.org/journals/je/je0004/1999JE001089/pdf/1999JE001089.pdf.
Crowston, K., Fagnot, I., 2008. The motivational arc of massive virtual collaboration, in: Proceedings of the IFIP WG
9.5 Working Conference on Virtuality and Society: Massive Virtual Communities, Lüneberg, Germany.
de Villiers, S., Nermoen, A., Jamtveit, B., Mathiesen, J., Meakin, P., Werner, S.C., 2012. Formation of Martian
araneiforms by gas-driven erosion of granular material. Geophysical Research Letters 39, L13204. doi:10.1029/
2012GL052226.
Ester, M., Kriegel, H.P., Sander, J., Xu, X., 1996. A density-based algorithm for discovering clusters in large spatial
databases with noise. Kdd URL: http://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf.
Ewing, R.C., Peyret, A.P.B., Kocurek, G., Bourke, M., 2010. Dune field pattern formation and recent transporting
winds in the olympia undae dune field, north polar region of mars. J. Geophys. Res. 115, E11007.
Fischer, D.A., Schwamb, M.E., Schawinski, K., Lintott, C., Brewer, J., Giguere, M., Lynn, S., Parrish, M., Sartori,
T., Simpson, R., Smith, A., Spronck, J., Batalha, N., Rowe, J., Jenkins, J., Bryson, S., Prsa, A., Tenenbaum, P.,
Crepp, J., Morton, T., Howard, A., Beleu, M., Kaplan, Z., vanNispen, N., Sharzer, C., DeFouw, J., Hajduk, A.,
Neal, J.P., Nemec, A., Schuepbach, N., Zimmermann, V., 2012. Planet Hunters: The first two planet candidates
identified by the public using the Kepler public archive data. Monthly Notices of the Royal Astronomical Society
419, 2900–2911. doi:10.1111/j.1365-2966.2011.19932.x.
Fortson, L., Masters, K., Nichol, R., Borne, K.D., Edmondson, E.M., Lintott, C., Raddick, J., Schawinski, K., Wallin,
J., 2012. Galaxy Zoo: Morphological Classification and Citizen Science, in: Way, M.J., Scargle, J.D., Ali, K.M.,
Srivastava, A.N. (Eds.), Advances in Machine Learning and Data Mining for Astronomy. Chapman & Hall/CRC.
Data mining and Knowledge Discovery, pp. 213–236.
Greeley, R., Arvidson, R.E., Barlett, P.W., Blaney, D., Cabrol, N.A., Christensen, P.R., Fergason, R.L., Golombek,
M.P., Landis, G.A., Lemmon, M.T., Others, 2006. Gusev crater: Wind-related features and processes observed by
the mars exploration rover spirit. Journal of Geophysical Research: Planets 111.
Hansen, C.J., Thomas, N., Portyankina, G., McEwen, A., Becker, T., Byrne, S., Herkenhoff, K., Kieffer, H., Mellon,
M., 2010. HiRISE observations of gas sublimation-driven activity in Mars’ southern polar regions: I. Erosion of
the surface. Icarus 205, 283–295. doi:10.1016/j.icarus.2009.07.021.
73
A
rti
cl
ei
n
Pr
es
Hansen, G.B., 2005. Ultraviolet to near-infrared absorption spectrum of carbon dioxide ice from 0.174 to 1.8 mm.
Journal of Geophysical Research 110, E11003. doi:10.1029/2005JE002531.
Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95. doi:10.1109/mcse.
2007.55.
Jones, E., Oliphant, T., Peterson, P., 2001. {SciPy}: Open source scientific tools for {Python} URL: http://www.
scipy.org.
Kaufmann, E., Hagermann, A., 2016. Experimental investigation of insolation-driven dust ejection from Mars’ CO 2
ice caps. Icarus 282, 118–126. doi:10.1016/j.icarus.2016.09.039.
Kerber, L., Dickson, J.L., Head, J.W., Grosfils, E.B., 2017. Polygonal ridge networks on Mars: Diversity of morphologies and the special case of the Eastern Medusae Fossae Formation. Icarus 281, 200–219. doi:10.1016/j.
icarus.2016.08.020.
Kieffer, H.H., 2007. Cold jets in the Martian polar caps. Journal of Geophysical Research 112, 08005. doi:10.1029/
2006JE002816.
Kraft, R.P., Burrows, D.N., Nousek, J.A., 1991. Determination of confidence limits for experiments with low numbers
of counts. The Astrophysical Journal 374, 344–355. doi:10.1086/170124.
Leighton, R.B., Murray, B.C., 1966. Behavior of Carbon Dioxide and Other Volatiles on Mars. Science 153, 136–144.
doi:10.1126/science.153.3732.136.
Lintott, C., Schawinski, K., Bamford, S., Slosar, A., Land, K., Thomas, D., Edmondson, E., Masters, K., Nichol,
R.C., Raddick, M.J., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J., 2011. Galaxy Zoo 1: Data release
of morphological classifications for nearly 900 000 galaxies. Monthly Notices of the Royal Astronomical Society
410, 166–178. doi:10.1111/j.1365-2966.2010.17432.x.
Lintott, C.J., Schawinski, K., Slosar, A., Land, K., Bamford, S., Thomas, D., Raddick, M.J., Nichol, R.C., Szalay, A.,
Andreescu, D., Murray, P., Vandenberg, J., 2008. Galaxy Zoo: Morphologies derived from visual inspection of
galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389, 1179–1189.
doi:10.1111/j.1365-2966.2008.13689.x.
Marshall, P.J., Lintott, C.J., Fletcher, L.N., 2014. Ideas for Citizen Science in Astronomy. ArXiv e-prints
arXiv:1409.4291.
McEwen, A.S., Eliason, E.M., Bergstrom, J.W., Bridges, N.T., Hansen, C.J., Delamere, W.A., Grant, J.A., Gulick,
V.C., Herkenhoff, K.E., Keszthelyi, L., Kirk, R.L., Mellon, M.T., Squyres, S.W., Thomas, N., Weitz, C.M., 2007.
Mars Reconnaissance Orbiter’s High Resolution Imaging Science Experiment (HiRISE). Journal of Geophysical
Research: Planets 112, E05S02. doi:10.1029/2005JE002605.
McKinney, W., 2010. Data Structures for Statistical Computing in Python, in: van der Walt, S., Millman, J. (Eds.),
Proceedings of the 9th Python in Science Conference, pp. 51–56.
Newman, C.E., Gómez-Elvira, J., Marin, M., Navarro, S., Torres, J., Richardson, M.I., Battalio, J.M., Guzewich,
S.D., Sullivan, R., de la Torre, M., Others, 2017. Winds measured by the rover environmental monitoring station (REMS) during the mars science laboratory (MSL) rover’s bagnold dunes campaign and comparison with
numerical modeling using MarsWRF. Icarus 291, 203–231.
Nguyen, T., Pankratius, V., Eckman, L., Seager, S., 2018. Computer-aided discovery of debris disk candidates: A
case study using the Wide-Field infrared survey explorer (WISE) catalog. Astronomy and Computing 23, 72–82.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss,
R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikitlearn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830.
Peng, T.r., English, J.E., Silva, P., Davis, D.R., Hayes, W.B., 2018. SpArcFiRe: morphological selection effects due
to reduced visibility of tightly winding arms in distant spiral galaxies. Mon. Not. R. Astron. Soc. 479, 5532–5543.
Perez, F., Granger, B.E., 2007. IPython: A System for Interactive Scientific Computing. Computing in Science
Engineering 9, 21–29. doi:10.1109/MCSE.2007.53.
Piqueux, S., Byrne, S., Kieffer, H.H., Titus, T.N., Hansen, C.J., 2015. Enumeration of Mars years and seasons since
the beginning of telescopic exploration. Icarus 251, 332–338. doi:10.1016/j.icarus.2014.12.014.
Piqueux, S., Byrne, S., Richardson, M.I., 2003a. Polygonal Landforms at the South Pole and Implications for Exposed
Water Ice, in: Sixth International Conference on Mars, p. 3275. URL: http://adsabs.harvard.edu/cgi-bin/
nph-data_query?bibcode=2003mars.conf.3275P&link_type=ABSTRACT.
74
A
rti
cl
ei
n
Pr
es
Piqueux, S., Byrne, S., Richardson, M.I., 2003b. Sublimation of Mars’s southern seasonal CO2 ice cap and the
formation of spiders. Journal of Geophysical Research 108, 5084. doi:10.1029/2002JE002007.
Pommerol, A., Portyankina, G., Thomas, N., Aye, K.M., Hansen, C.J., Vincendon, M., Langevin, Y., 2011. Evolution
of south seasonal cap during Martian spring: Insights from high-resolution observations by HiRISE and CRISM
on Mars Reconnaissance Orbiter. Journal of Geophysical Research 116, E08007. doi:10.1029/2010JE003790.
Robbins, S.J., Antonenko, I., Kirchoff, M.R., Chapman, C.R., Fassett, C.I., Herrick, R.R., Singer, K., Zanetti, M.,
Lehan, C., Huang, D., Gay, P.L., 2014. The variability of crater identification among expert and community crater
analysts. Icarus 234, 109–131. doi:10.1016/j.icarus.2014.02.022.
Sauermann, H., Franzoni, C., 2015. Crowd science user contribution patterns and their implications. Proceedings of
the National Academy of Sciences 112, 679–684. doi:10.1073/pnas.1408907112.
Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C., Lintott, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S.,
McMaster, A., R. M Miller, G., 2017a. Discovery of araneiforms outside of the South Polar Layered Deposits, p.
422.05. URL: http://adsabs.harvard.edu/abs/2017DPS....4942205S.
Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S., McMaster,
A., Miller, G.R.M., 2017b. Planet Four: Terrains – Discovery of araneiforms outside of the South Polar layered
deposits. Icarus doi:10.1016/j.icarus.2017.06.017.
Schwamb, M.E., Lintott, C.J., Fischer, D.A., Giguere, M.J., Lynn, S., Smith, A.M., Brewer, J.M., Parrish, M., Schawinski, K., Simpson, R.J., 2012. Planet Hunters: Assessing the Kepler Inventory of Short-period Planets. The
Astrophysical Journal 754, 129. doi:10.1088/0004-637X/754/2/129.
Simpson, R.J., Povich, M.S., Kendrew, S., Lintott, C.J., Bressert, E., Arvidsson, K., Cyganowski, C., Maddison, S.,
Schawinski, K., Sherman, R., Smith, A.M., Wolf-Chase, G., 2012. The Milky Way Project First Data Release: A
bubblier Galactic disc. Monthly Notices of the Royal Astronomical Society 424, 2442–2460. doi:10.1111/j.
1365-2966.2012.20770.x.
Smith, D.E., Zuber, M.T., Neumann, G.A., 2001. Seasonal Variations of Snow Depth on Mars. Science 294, 2141–
2146. doi:10.1126/science.1066556.
Smith, I.B., Spiga, A., Holt, J.W., 2015. Aeolian processes as drivers of landform evolution at the South Pole of Mars.
Geomorphology 240, 54–69. doi:10.1016/j.geomorph.2014.08.026.
Thomas, N., Hansen, C.J., Portyankina, G., Russell, P.S., 2010. HiRISE observations of gas sublimation-driven
activity in Mars’ southern polar regions: II. Surficial deposits and their origins. Icarus 205, 296–310. doi:10.
1016/j.icarus.2009.05.030.
Thomas, N., Portyankina, G., Hansen, C.J., Pommerol, A., 2011. Sub-surface CO2 gas flow in Mars’ polar regions:
Gas transport under constant production rate conditions. Geophysical Research Letters 38, L08203–n/a. doi:10.
1029/2011GL046797.
Willett, K.W., Lintott, C.J., Bamford, S.P., Masters, K.L., Simmons, B.D., Casteels, K.R.V., Edmondson, E.M.,
Fortson, L.F., Kaviraj, S., Keel, W.C., Melvin, T., Nichol, R.C., Raddick, M.J., Schawinski, K., Simpson, R.J.,
Skibba, R.A., Smith, A.M., Thomas, D., 2013. Galaxy Zoo 2: Detailed morphological classifications for 304 122
galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 435, 2835–2860.
doi:10.1093/mnras/stt1458.
Zachte, E., 2012.
Wikipedia Statistics Tables English.
URL: http://stats.wikimedia.org/EN/
TablesWikipediaEN.htm.
75