The following article is Open access

ABYSS. I. Targeting Strategy for the APOGEE and BOSS Young Star Survey in SDSS-V

Published 2023 May 3 © 2023. The Author(s). Published by the American Astronomical Society.
Citation: Marina Kounkel et al 2023 ApJS 266 10, DOI 10.3847/1538-4365/acc106

0067-0049/266/1/10

Abstract

The fifth iteration of the Sloan Digital Sky Survey is set to obtain optical and near-infrared spectra of ∼5 million stars of all ages and masses throughout the Milky Way. As a part of these efforts, the APOGEE and BOSS Young Star Survey (ABYSS) will observe ∼$10^{5}$ stars with ages <30 Myr that have been selected using a set of homogeneous selection functions that make use of different tracers of youth. The ABYSS targeting strategy we describe in this paper aims to provide the largest spectroscopic census of young stars to date. It consists of eight different types of selection criteria that take into consideration the position on the H-R diagram, infrared excess, variability, and the position in phase space. The resulting catalog of ∼200,000 sources (of which half are expected to be observed) provides representative coverage of the young Galaxy, including both nearby diffuse associations and more distant massive complexes, reaching toward the inner Galaxy and the Galactic center.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The Sloan Digital Sky Survey (SDSS) has obtained spectra of hundreds of thousands of stars, both within the Milky Way and beyond. Most of these are evolved stars; in particular red giants have historically been favored for the observations due to the ability to use them as tracers of stellar populations at large distances. However, SDSS has observed spectra of thousands of young stellar objects (YSOs) through its auxiliary programs (Román-Zúñiga et al. 2023).

During SDSS-III, young stars were targeted by the IN-SYNC program (Cottaar et al. 2014). The targeting strategy focused on known members of nearby populations visible from the Northern Hemisphere and containing large concentrations of young stars within the field of view of the telescope. The latter condition was required to fill a large fraction of the available spectral fibers. These included NGC 1333 (Foster et al. 2015), IC 348 (Cottaar et al. 2015), the Orion A molecular cloud (Da Rio et al. 2016), as well as NGC 2264. In total, spectra of ∼3600 YSOs were taken with the APOGEE spectrograph. The selection strategy was based on existing catalogs of members and thus yielded samples without significant contamination, but these samples can be considered neither homogeneous nor complete.

SDSS-IV expanded its footprint in the targeting of YSOs and complemented previous data by observing the Orion Complex (Cottle et al. 2018), the Taurus Molecular Clouds, Upper Sco, W3/W4/W5 clusters, Cygnus X, Rosette Nebula, Carina Nebula, more evolved clusters such as the Pleiades and α Per, and others (for a complete overview of the regions observed, see Román-Zúñiga et al. 2023). With respect to SDSS-III, a greater emphasis was given to targeting sources in a more homogeneous manner, for instance by using color cuts or photometric variability to select sources rather than relying on existing confirmation of their youth. However, different selection criteria were developed for each individual region, because the observations fell under the auspices of various programs (Beaton et al. 2021; Santana et al. 2021; Román-Zúñiga et al. 2023). Spectra of >30,000 stars were taken across the plates covering these regions. Since many of these targets were selected prior to the release of parameters from the Gaia mission, a significant fraction of them are evolved field stars, with the actual census of young stars among them being <10,000.

With the transition to SDSS-V, several changes have been implemented to the survey strategy.

  • 1.  
    All sky accessibility. The spectrographs utilized by the survey are installed on two telescopes in the Northern and Southern Hemispheres.
  • 2.  
    Fast instrument reconfiguration. Rather than using predrilled plates to position the fibers of the spectrographs, robotic fiber positioners are now used (Pogge et al. 2020). This allows both greater flexibility in targeting and lower operational overheads. It is now possible to create a comprehensive sample of spectra of stars across the Galaxy, without necessarily being limited to a particular line of sight or field of view.
  • 3.  
    Prioritization. The young star program is no longer considered auxiliary; rather it is now one of the core programs of the survey.
  • 4.  
    Improved selection function. A homogeneous targeting strategy across the entire sky would improve the subsequent modeling of the sample. We can no longer treat individual star-forming regions independently. Regardless of the distance to a particular population, or the number of stars within it, a simple selection of all young stars in the solar neighborhood is needed.

However, devising a homogeneous targeting strategy is rather arduous because, depending on their mass and age, young stars show a great variety of observational signatures that could be used to confirm their youth. As such, there is no single criterion that can uniformly select all young stars at different evolutionary stages across all masses and distances. Driven by these exigencies, we are forced to develop more sophisticated targeting strategies than previously implemented.

In this paper, we provide an overview of the criteria used to target young stars in SDSS-V, how they have evolved to date, and the general observation strategy. We also give a brief overview of the data collected during the first year of operations, which began in 2021.

2. ABYSS Overview

The APOGEE and BOSS Young Star Survey (ABYSS) is the name of the young star program in SDSS-V. This program is set to produce optical and near-IR spectra of ∼$10^{5}$ young stars across the entire sky. The primary goals for these data include (but are not limited to) the following:

  • 1.  
    The first goal involves characterizing the dynamical, spatial, and temporal structure of individual star-forming regions, and how these populations evolve over time.
  • 2.  
    The second goal involves tracing the global structure of the young stars, from the solar neighborhood as a whole to Galactic scales. This also involves characterizing the spiral arm structure and the kinematic and dynamic properties of the young Milky Way disk.
  • 3.  
    The third goal involves examining the connection between young stars and the gas from which they formed.
  • 4.  
    The fourth goal involves measuring the fundamental stellar properties of young stars and comparing them to state-of-the-art models of stellar structure and evolution. This also involves examining how the environment in which a star is born affects these properties.
  • 5.  
    The fifth goal involves characterizing multiplicity and orbital parameters of young stars across different populations.

In this section, we present the underlying logistics behind ABYSS, including survey structure, data acquisition strategy, and target selection.

2.1. Survey Strategy

SDSS-V utilizes two 2.5 m telescopes; one is located in the Northern Hemisphere at the Apache Point Observatory (APO), and the second one is located in the Southern Hemisphere at Las Campanas Observatory (LCO; Bowen & Vaughan 1973; Gunn et al. 2006; Blanton et al. 2017). Both of these telescopes have two multiobject fiber spectrographs: APOGEE and BOSS. APOGEE covers the H band, with a wavelength range of 1.51–1.7 μm and R ∼ 22,500 (Wilson et al. 2010; Majewski et al. 2017; Wilson et al. 2019). BOSS is an optical spectrograph, covering a wavelength range of ∼3600–10400 Å, with R ∼ 1800 (Smee et al. 2013). A total of 300 APOGEE fibers and 500 BOSS fibers can be placed simultaneously in a 3° field of view at APO and a 2° field of view at LCO, with fiber diameters of ∼2″ and ∼1.3″ at the two observatories, respectively.

With the new capability of SDSS to rapidly reconfigure the fiber placement due to the robotic fiber positioners (Sayres et al. 2021), and the ambitious goals of a comprehensive survey obtaining spectra across entire sky (Kollmeier et al. 2017), the exposure time on all fields is set at 15 minutes.

SDSS-V divides its efforts into three mappers: Milky Way Mapper (MWM), Black Hole Mapper, and Local Volume Mapper. All the programs aiming to obtain stellar spectra, including ABYSS, are a part of MWM. Each program within each mapper can define one or several cartons: a subset of stars selected by a particular set of criteria that share the instrument configuration, requirements on cadence, and number of epochs.

ABYSS has defined eight cartons of stars to be observed with APOGEE, of which five are considered optically bright enough to also be observed with BOSS. These definitions are described in Section 2.2. A significant fraction of targeted sources (∼66%) is observed by both instruments. This offers several advantages: high-resolution APOGEE spectra yield precise radial velocities (RVs) at the sub-km s$^{-1}$ level, while BOSS gives access to various lines that clearly inform on stellar youth (such as Li I and Hα). The two instruments have different faint limits set by the program: H < 13 mag for APOGEE and $G_{\mathrm{RP}}$ < 15.5 mag for BOSS. The faint limit is set at the typical magnitude that would reach a signal-to-noise ratio of 30 in a coadded spectrum of three 15 minute exposures, which was deemed sufficiently high to extract the fundamental stellar parameters from the spectra.

These limits apply for all of the defined cartons. As some of the YSOs are too faint to be detected in the optical regime due to extinction, some of the cartons can only be observed with APOGEE. Individual stars can also meet the faint limit in optical, or in infrared, but not both; and so, ∼20% of the stars for either spectrograph are unique, with the remaining ∼80% being targeted by both instruments.

For ABYSS targets, a complete set of observations requires three APOGEE and three BOSS epochs for a given object, brightness limits permitting. This number of exposures is needed to confirm multiplicity or variability within a spectrum. No firm constraints on the cadence of observations have been imposed.

2.2. Carton Definitions

In this subsection, we present eight independent definitions for cartons that were used to target sources for observations as part of the ABYSS program (Figure 1). These cartons rely on various criteria, such as infrared excess, position on the H-R diagram, photometric variability, and membership of moving groups/clusters.

Figure 1. Spatial distribution of sources in Galactic coordinates for each of the cartons.

The initial set of observations conducted during the first 6 months of operations, still using plug plates (see Section 3), relied on the V0 version of the targeting. With the instrument upgraded to robotic positioners, the targeting criteria were updated to V0.5. The transition to V1 will occur in 2023. Data Release 18 makes available the V0.5 version of the targeting, which is the focus of this paper. However, for the sake of the historical record, the full evolution of the selection is described.

2.2.1. Disk

Historically, the YSOs that have been easiest to identify are those that have a large infrared excess due to the presence of a protoplanetary disk, particularly in the mid-IR regime. In the last few decades, telescopes such as Spitzer and the Wide-field Infrared Survey Explorer (WISE) have greatly expanded the census of dusty YSOs. WISE, being an all-sky survey, is especially informative for targeting. Several studies have used WISE to search for YSOs (e.g., Koenig & Leisawitz 2014; Marton et al. 2016; Kang et al. 2017), but they either focused only on a specific star-forming region or had a large degree of contamination across the entire sky.

At this stage in the survey, the goal is to create a census of sources that should be targeted for follow-up observations (rather than an explicit classification of YSOs into evolutionary stages), so we use simple color cuts in WISE photometry for this carton. To minimize the selection of very distant, highly extincted field stars (which are the main source of contamination), we impose a parallax cut as well. This implicitly requires all of the identified sources to be bright enough in the optical regime to be detected and to have reliable astrometry with Gaia. We select sources satisfying the following:

  • 1.  
    W1 − W2 > 0.25 mag;
  • 2.  
    W2 − W3 > 0.5 mag;
  • 3.  
    W3 − W4 > 1.5 mag;
  • 4.  
    π > 0.3 mas.

These cuts have been evaluated against known dusty YSOs in the Orion molecular clouds (Megeath et al. 2012), and they are shown in Figure 2.
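The four Disk criteria translate directly into a boolean filter. The following is a minimal Python sketch rather than the survey's actual implementation; the function name and argument names are hypothetical, and it assumes that ALLWISE magnitudes and a Gaia parallax in mas are already available for each source.

```python
def in_disk_carton(w1, w2, w3, w4, parallax_mas):
    """Return True if a source passes the Disk color and parallax cuts.

    w1..w4 are ALLWISE magnitudes; parallax_mas is the Gaia parallax in mas.
    A missing input (None) fails the selection, because all four bands and
    an astrometric solution are required by the cuts.
    """
    if None in (w1, w2, w3, w4, parallax_mas):
        return False
    return (
        w1 - w2 > 0.25
        and w2 - w3 > 0.5
        and w3 - w4 > 1.5
        and parallax_mas > 0.3
    )
```

For example, a source with W1 = 10.0, W2 = 9.5, W3 = 8.5, W4 = 6.5 and a parallax of 2.5 mas passes all four conditions.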

Figure 2. Criteria for the Disk and Embedded cartons, showing color–color diagrams of disk-bearing dusty YSOs from Megeath et al. (2012) in Orion A and B molecular clouds, used as a reference to constrain the targeting selection. Sources in yellow are those with G < 18.5, π > 0.3 mas, used as a template to map the Disk carton; the relevant color cuts to select this carton are shown as red solid lines. Sources in blue are optically faint, representative of the Embedded carton; the color cuts are shown in red dashed lines. Bottom right panel shows the data for sources obtained as part of ABYSS observations with APOGEE, separating the sources into likely YSOs and the contaminating red giants through a $\log g > 3.2$ cut. The black line shows the color cut introduced for the Embedded carton in the V1 targeting, as prior to this it had significant contamination.

ALLWISE photometry was used for the selection. Although there have been recent rereductions, such as unWISE or neoWISE, their improvements are primarily in W1- and W2-band photometry; W3 and W4 mostly remain as is. Longer wavelength bands lack the sensitivity of shorter wavelength bands; furthermore, they have not been observed for as long due to WISE running out of cryogenic coolant needed to suppress the telescope emission at these wavelengths. Nonetheless, W3 and W4 bands are critical for reliably identifying dusty disk-bearing stars. As such, by requiring these bands and using merged photometry, any improvements in W1 or W2 have negligible effect on the selection.

In the V0 version of the targeting, these cuts formed the basis of YSO_S1 carton. In version V0.5, the carton was renamed and split into YSO_Disk_APOGEE and YSO_Disk_BOSS, containing 28,832 and 37,478 stars respectively. In version V1, Gaia Data Release 2 (DR2) astrometry was upgraded to EDR3.

2.2.2. Embedded

Some disk-bearing sources are too faint to have reliable Gaia parallaxes. This is usually the case for class I protostars that are still embedded in their natal envelopes, class II YSOs that have edge-on disks, or the sources that are more distant and thus have more extinction along the line of sight. Without a distance estimate, it can be difficult to separate bona fide YSOs from distant and heavily extincted red giants. As such, more stringent color cuts, not just on ALLWISE photometry but also Two Micron All Sky Survey (2MASS), are required:

  • 1.  
    G > 18.5 mag, or undetected;
  • 2.  
    J − H > 1 mag, H − K > 0.5 mag;
  • 3.  
    W1 − W2 > 0.5 mag, W2 − W3 > 1 mag, W3 − W4 > 1.5 mag;
  • 4.  
    W3 − W4 > 0.8 × (W1 − W2) + 1.1 mag.

These cuts are shown in Figure 2.

In the V0 version of the targeting, the carton was referred to as YSO_S2; in V0.5 it was renamed YSO_Embedded_APOGEE, containing 11,086 stars. Following the first year of operations, the carton was re-examined; approximately half of the observed sources were red giants, as shown in Figure 2 (bottom right panel). Therefore, in V1 we added a further cut

  • 1.  
    H − K > 0.65 × (J − H) − 0.25 mag

to minimize the contamination by evolved stars and further restrict the selection to the parameter space that is most commonly inhabited by spectroscopically confirmed YSOs, limiting the sample to 5455 stars.
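The Embedded criteria, including the extra V1 near-infrared cut, can be sketched in Python as follows. This is an illustration under stated assumptions, not the survey's code: the function and argument names are hypothetical, `g` is taken to be `None` when Gaia does not detect the source, and all 2MASS and ALLWISE magnitudes are assumed to be present.

```python
def in_embedded_carton(g, j, h, k, w1, w2, w3, w4, apply_v1_cut=True):
    """Return True if a source passes the Embedded cuts.

    g is the Gaia G magnitude, or None when the source is undetected.
    j, h, k are 2MASS magnitudes; w1..w4 are ALLWISE magnitudes.
    """
    optically_faint = g is None or g > 18.5
    jh, hk = j - h, h - k
    near_ir = jh > 1.0 and hk > 0.5
    mid_ir = (
        w1 - w2 > 0.5
        and w2 - w3 > 1.0
        and w3 - w4 > 1.5
        and w3 - w4 > 0.8 * (w1 - w2) + 1.1
    )
    # V1 addition: H-K must lie above a line in (J-H, H-K) space, which
    # removes sources whose near-IR colors resemble reddened red giants.
    v1_ok = (hk > 0.65 * jh - 0.25) if apply_v1_cut else True
    return optically_faint and near_ir and mid_ir and v1_ok
```

Toggling `apply_v1_cut` shows which sources the V1 revision removes: a source with J − H = 2.5 and H − K = 0.6 passes the V0.5 near-infrared cuts but fails the V1 line, which requires H − K > 1.375 at that J − H.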

2.2.3. Nebula

The selection criteria of disk-bearing and embedded sources rely on WISE bands W3 and W4. W3 and W4 however become less effective in regions of high nebulosity, as they saturate, producing gaps in the coverage. To fill them, we used shorter wavelength data, using a set of criteria that is tuned to autonomously find such nebulous regions:

  • 1.  
    If W4 is not reported, W2 − W3 > 4 mag.
  • 2.  
    If W3 and W4 are not reported, J − H > 1.1 mag.

This preferentially selects sources found in gaps of the previous cartons in discrete regions on the sky, such as the center of the Orion Nebula (Figure 3).

Figure 3. Spatial distribution (in Galactic coordinates) of the selected sources in Disk, Embedded, and Nebula cartons toward the Orion Nebula Cluster. Note that the sources in the Nebula carton fill in a gap in the other two cartons.

Additionally, some of the sources selected by these cuts are found off the Galactic plane and/or away from known star-forming regions. They produce an excess of targets along a narrow line that follows the scanning law of the WISE telescope, and so these sources appear to be suspect. To exclude this contamination, we also required b < 5° together with either b > −5° or l > 180°.
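A compact way to read the Nebula selection is as a branch on which WISE bands are missing, followed by the footprint restriction. The Python sketch below is illustrative only; the names are hypothetical, missing bands are represented by `None`, and treating any source with a reported W4 as outside this carton is an assumption of the sketch.

```python
def in_nebula_carton(j, h, w2, w3, w4, l_deg, b_deg):
    """Return True if a source passes the Nebula cuts.

    w3 and w4 are None when ALLWISE reports no measurement in that band.
    l_deg and b_deg are Galactic longitude and latitude in degrees.
    """
    if w4 is not None:
        # Assumption: sources with W4 are left to the Disk/Embedded cartons.
        return False
    if w3 is not None:
        color_ok = w2 - w3 > 4.0   # W4 missing, W3 available
    else:
        color_ok = j - h > 1.1     # both W3 and W4 missing
    # Footprint: b < 5 deg, and either b > -5 deg or l > 180 deg.
    footprint_ok = b_deg < 5.0 and (b_deg > -5.0 or l_deg > 180.0)
    return color_ok and footprint_ok
```

With these conditions, a source near the Orion Nebula (l ≈ 209°, b ≈ −19°) is kept because l > 180°, whereas the same colors at l = 100°, b = −19° are rejected.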

This carton was introduced in V0 as YSO_S2.5. In V0.5 it was renamed as YSO_Nebula_APOGEE, containing 1112 stars.

2.2.4. CMZ

The inner Galaxy, including the central molecular zone (CMZ), has been surveyed with Spitzer as a part of GLIMPSE and MIPSGAL programs (Carey et al. 2009; Churchwell et al. 2009; Gutermuth & Heyer 2015). These data offer a substantial improvement on sensitivity and resolution in comparison to WISE; therefore they are advantageous in targeting stars outside of the solar neighborhood.

Using the properties of massive YSOs toward the Galactic center identified by An et al. (2011), we select a sample of candidates with the following:

  • 1.  
    [8.0] − [24] > 2.5 mag, i.e., very red sources, using photometry from Gutermuth & Heyer (2015);
  • 2.  
    π < 0.2 mas or not detected and/or measured, to ensure the sources are distant.

In V0 of the targeting, the YSO_CMZ carton also imposed a spatial limit of 358° < l < 2° and −1° < b < 1° to focus solely on the CMZ. In V0.5, the carton was renamed YSO_CMZ_APOGEE and the spatial restriction was removed, admitting sources from the entire MIPSGAL footprint. This enables the identification of YSO candidates across the inner Galaxy; the resulting carton contains 13,170 stars.
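The CMZ selection, with the optional V0 footprint, can be sketched as below. This is a hedged illustration with hypothetical names: `m8` and `m24` stand for the [8.0] and [24] magnitudes, and a missing parallax is represented by `None`. The only subtlety is that the V0 longitude range crosses l = 0°, so longitudes are wrapped before the comparison.

```python
def in_cmz_carton(m8, m24, parallax_mas=None, l_deg=None, b_deg=None,
                  v0_footprint=False):
    """Return True if a source passes the CMZ cuts.

    m8, m24: [8.0] and [24] magnitudes. parallax_mas is None when Gaia has
    no detection or no parallax. Set v0_footprint=True to also apply the
    V0 box of 358 deg < l < 2 deg, -1 deg < b < 1 deg.
    """
    red = m8 - m24 > 2.5
    distant = parallax_mas is None or parallax_mas < 0.2
    if not v0_footprint:
        return red and distant
    # Map longitude onto [-180, 180) so that 358..360 becomes -2..0.
    l_wrapped = (l_deg + 180.0) % 360.0 - 180.0
    in_box = -2.0 < l_wrapped < 2.0 and -1.0 < b_deg < 1.0
    return red and distant and in_box
```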

2.2.5. Variable

On average, YSOs tend to be more variable than main-sequence stars (e.g., Kounkel et al. 2022b). Part of the reason for this variability is the presence of stronger magnetic fields, which lead to more prominent starspots with a larger filling factor. In addition, protoplanetary disks can occult the photosphere, and accretion events can lead to an increase in brightness.

To estimate variability $V_x$, we use multiepoch photometry by Gaia. Following Belokurov et al. (2017), we define the photometric variability in a given filter x as follows:

$$V_x = \frac{\sqrt{\mathtt{phot\_x\_n\_obs}}}{\mathtt{phot\_x\_mean\_flux\_over\_error}},$$

where x corresponds to the Gaia G, $G_{\mathrm{BP}}$, or $G_{\mathrm{RP}}$ bands, phot_x_n_obs is the number of observations that contributed to the photometry in a given band, and phot_x_mean_flux_over_error is the mean flux in a given band divided by its error. For strongly variable sources, the photometric uncertainty is comparable to the amplitude of variability.
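As a minimal sketch, the variability statistic can be computed from the two Gaia catalog columns named above. The form used here, $V_x = \sqrt{N_{\mathrm{obs}}}$ divided by the flux-over-error ratio, is assumed from the Belokurov et al. (2017) amplitude estimate; the function name is hypothetical.

```python
import math


def variability(n_obs, mean_flux_over_error):
    """Fractional flux scatter V_x for one Gaia band.

    n_obs: phot_x_n_obs, the number of contributing observations.
    mean_flux_over_error: phot_x_mean_flux_over_error for the same band.
    Scaling the inverse of flux-over-error by sqrt(n_obs) turns the error
    on the mean flux into an approximate per-epoch fractional scatter.
    """
    return math.sqrt(n_obs) / mean_flux_over_error
```

For instance, a source with 400 observations and a flux-over-error ratio of 1000 has $V_x$ = 20/1000 = 0.02.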

Using the list of members of the Orion Complex from Kounkel et al. (2020) as a representative sample of young stars, we develop a set of variability-based criteria that most cleanly preserves a large fraction of members while rejecting field stars within the volume of space surrounding Orion. This set of criteria was later applied to the entire sky and further modified to preserve the morphology of nearby star-forming regions and minimize contamination, resulting in the following:

  • 1.  
    $V_G > 0.02$, $V_{\mathrm{BP}} > 0.02$, $V_{\mathrm{RP}} > 0.02$; the reference YSOs tend to be more variable than the field stars.
  • 2.  
    ${V}_{G}\lt {V}_{\mathrm{BP}}\lt {V}_{G}^{0.75}$, $0.75{V}_{G}\lt {V}_{\mathrm{RP}}\lt {V}_{G}^{0.95}$ (for $V_G < 1$, ${V}_{G}^{0.75}$ is the larger bound); correlation of variability in different bandpasses on the order of unity appears to be a strong indicator of YSOs. On the other hand, while YSOs with different correlation in variability do exist, it is difficult to reliably separate them from the field stars, and so many young variable stars may be excluded from the selection. The slopes of these power laws were determined from examining Figure 4.
  • 3.  
    $G_{\mathrm{BP}} - G_{\mathrm{RP}} > 1.3$; hot stars do not have convective atmospheres, and thus they generally do not have spotted photospheres. While this specific cut is redder than the convective limit, it also minimizes the extreme contamination from red giants. Variability among hotter stars is often an indicator of other phenomena, such as eclipsing binaries, that are not indicators of youth.
  • 4.  
    ${M}_{\mathrm{BP}}\gt 5{\mathrm{log}}_{10}{V}_{\mathrm{BP}}+11$ preferentially excludes the evolved subgiants, despite some overlap in the parameter space with bona fide YSOs.
  • 5.  
    $2.5(G_{\mathrm{BP}} - G_{\mathrm{RP}}) - 1 < M_G < 2.5(G_{\mathrm{BP}} - G_{\mathrm{RP}}) + 2.5$; the previous criteria preferentially select YSOs, but they still include some strongly variable main-sequence stars. This selection confines the parameter space on the H-R diagram to minimize contamination.
  • 6.  
    π > 0.3 mas, G < 18.5, H < 13, limiting the selection to brighter stars with reliable parallaxes. Note that the H-band cut is applied both to APOGEE and to BOSS samples, as sources with H > 13 and $G_{\mathrm{RP}}$ < 15.5 preferentially trace out more distant stars that seem to be more strongly contaminated than the sources within 1 kpc. As such, the BOSS variable sample is a subset of the APOGEE variable sample, with the faint limit in place.

These cuts are shown in Figure 4.
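The six Variable criteria can be combined into a single filter, sketched here in Python with hypothetical names. Several assumptions are made: absolute magnitudes are computed from the parallax as $M = m + 5\log_{10}(\pi/\mathrm{mas}) - 10$ with no extinction correction, and the $V_{\mathrm{BP}}$ bounds are ordered as $V_G < V_{\mathrm{BP}} < V_G^{0.75}$, because for $V_G < 1$ the power $V_G^{0.75}$ is larger than $V_G$ and the opposite ordering would admit no sources.

```python
import math


def absolute_mag(apparent_mag, parallax_mas):
    """M = m + 5 log10(parallax / mas) - 10 (no extinction correction)."""
    return apparent_mag + 5.0 * math.log10(parallax_mas) - 10.0


def in_variable_carton(v_g, v_bp, v_rp, g, bp, rp, h, parallax_mas):
    """Return True if a source passes the Variable cuts (APOGEE version)."""
    # Criterion 6: astrometric and brightness limits.
    if not (parallax_mas > 0.3 and g < 18.5 and h < 13.0):
        return False
    color = bp - rp
    m_g = absolute_mag(g, parallax_mas)
    m_bp = absolute_mag(bp, parallax_mas)
    return (
        min(v_g, v_bp, v_rp) > 0.02                        # criterion 1
        and v_g < v_bp < v_g ** 0.75                       # criterion 2 (assumed order)
        and 0.75 * v_g < v_rp < v_g ** 0.95                # criterion 2
        and color > 1.3                                    # criterion 3
        and m_bp > 5.0 * math.log10(v_bp) + 11.0           # criterion 4
        and 2.5 * color - 1.0 < m_g < 2.5 * color + 2.5    # criterion 5
    )
```

The BOSS version would add the $G_{\mathrm{RP}}$ < 15.5 faint limit on top of the same conditions.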

Figure 4. Criteria for the selection of stars in the Variable carton. Top panels show variability distribution in the sample of stars toward the Orion Complex, with known members highlighted in blue, and field stars shown in yellow. The red lines show the cuts to the sample based on the correlation of variability on the order of unity described in the text. Bottom panel shows the full sample of stars within 500 pc that meets the minimum variability and variability correlation cuts (see Section 2.2.5), showing the sources that have been selected as YSO candidates in blue, and the other variable stars in yellow.

The carton based on this selection was introduced in V0 as YSO_S3. In V0.5 it was renamed and split into YSO_Variable_APOGEE and YSO_Variable_BOSS, containing 52,691 and 47,758 stars respectively. In V1, the selection was upgraded from Gaia DR2 to Gaia EDR3.

We note that the third Gaia data release (Gaia DR3) has produced a catalog of young star candidates based on their variability (Marton et al. 2022). The selection presented here was originally derived prior to the availability of these data. Out of 79,375 sources presented in Gaia DR3, our selection has 3963 stars in common with this carton, and 16,474 stars across all of the cartons. Of the remaining 20,225 candidates in Gaia DR3 that would meet our faint limit, as many as half appear to be contamination from highly reddened distant main-sequence and red giant stars, but in future versions of the targeting definition, with some refinement, it may be possible to take advantage of this catalog. On the other hand, our current selection does not extend to magnitudes as faint (and, indeed, applying the same criteria to fainter stars does appear to significantly increase contamination), but it does appear to have greater sensitivity to populations with ages of up to a few tens of megayears.

2.2.6. Cluster

Most of the selection criteria devised in this work preferentially target low-mass YSOs, still in the pre-main-sequence (PMS) phase of stellar evolution. Young late B, A, and F stars reach the main sequence quickly and become difficult to separate from field stars using conventional photometric selection criteria. To identify them, it is however possible to take advantage of the fact that young stars generally form in large associations, typically with hundreds or thousands of other members. Young populations tend to be dynamically cold, with a velocity dispersion of less than a few kilometers per second. Thus, it is possible to find young moving groups by performing a clustering analysis in position and velocity phase space. As a selection of likely members using this method does not depend on spectral type, it makes it possible to include young B, A, and F stars alongside later type stars.

Kounkel et al. (2020) applied hierarchical clustering to Gaia DR2 data within 3 kpc of the Sun. The initial data selection consisted of π > 0.2 mas, −30° < b < 30°, ${v}_{\alpha ,\delta }^{\mathrm{lsr}}\lt 60$ km s$^{-1}$, as well as additional cuts based on astrometric and photometric quality. The clustering was performed with HDBSCAN (Campello et al. 2013) in several slices in distance and then stitched together. In total, more than 8000 moving groups were identified, consisting of ∼1 million stars. The ages of the identified moving groups were estimated through isochrone fitting using the neural net Auriga (Kounkel et al. 2020).

We selected all sources in the moving groups with an age $\log t\,(\mathrm{yr}) < 7.5$, i.e., younger than ∼30 Myr. The resulting subset forms the basis of the YSO_Cluster carton, introduced in the V0 version of the targeting, and split into YSO_Cluster_APOGEE and YSO_Cluster_BOSS in V0.5, containing 45,461 and 59,065 stars respectively.
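The age threshold can be checked with a short worked conversion. The sketch below assumes that the logarithmic age is expressed in years, so that a value of 7.5 corresponds to roughly 32 Myr, in line with the <30 Myr target age range stated in the abstract; the function names are hypothetical.

```python
def log_age_to_myr(log_age_yr):
    """Convert log10(age / yr) into an age in Myr."""
    return 10.0 ** log_age_yr / 1.0e6


def is_young_group(log_age_yr, threshold=7.5):
    """Return True if a moving group is younger than the carton threshold."""
    return log_age_yr < threshold
```

Here `log_age_to_myr(7.5)` evaluates to about 31.6 Myr, so a group with a logarithmic age of 7.0 (10 Myr) is retained and one with 8.0 (100 Myr) is not.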

Older populations, with $\log t\,(\mathrm{yr}) > 7.5$, are being considered by the survey as a part of an open fiber program, but only as targets of opportunity, with a single epoch obtained with either BOSS or APOGEE, and are not included among the core programs of the survey.

2.2.7. PMS

There have been several dedicated studies that focused on identifying PMS stars using an H-R diagram generated through Gaia photometry and astrometry. We incorporate the resulting catalogs from two such works.

Zari et al. (2018) selected low-mass PMS stars using Gaia DR2 data within 500 pc that are found above (and are therefore younger than) the 20 Myr PARSEC isochrone (Marigo et al. 2017) and are fainter than $M_G$ = 4 mag. The photometry was first corrected for extinction, and only sources with $A_G$ < 0.92 mag were retained, to avoid reddened field stars. Furthermore, additional cuts were made to select stars with low parallax errors ($\sigma_\pi/\pi$ < 20%), to limit the sample to disk stars (total tangential velocity $\sqrt{{v}_{l}^{2}+{v}_{b}^{2}}\lt 40$ km s$^{-1}$), and to produce a "clean" H-R diagram as suggested by Lindegren et al. (2018, Appendix C).
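The astrometric part of this selection can be illustrated with a short Python sketch. It is not the code of Zari et al. (2018): the names are hypothetical, the isochrone and extinction steps are omitted, the 500 pc limit is approximated as π > 2 mas, and the proper motions are assumed to be in mas yr$^{-1}$ with the cos b factor already included in the longitude component. The tangential velocity uses the standard conversion $v = 4.74047\,\mu/\pi$ km s$^{-1}$.

```python
import math

# 1 au/yr expressed in km/s; converts (mas/yr) / mas into km/s.
K_KM_S = 4.74047


def tangential_velocity(pm_l, pm_b, parallax_mas):
    """Total tangential velocity in km/s from proper motions and parallax."""
    mu = math.hypot(pm_l, pm_b)  # total proper motion in mas/yr
    return K_KM_S * mu / parallax_mas


def passes_astrometric_cuts(parallax_mas, parallax_err_mas, pm_l, pm_b, abs_g0):
    """Distance, parallax-precision, kinematic, and faint-end cuts only.

    abs_g0 is the extinction-corrected absolute G magnitude.
    """
    return (
        parallax_mas > 2.0                                   # within ~500 pc
        and parallax_err_mas / parallax_mas < 0.20           # parallax error < 20%
        and tangential_velocity(pm_l, pm_b, parallax_mas) < 40.0
        and abs_g0 > 4.0                                     # low-mass regime
    )
```

For a star with proper motions of 3 and 4 mas yr$^{-1}$ at a parallax of 2.5 mas, the tangential velocity is 4.74047 × 5 / 2.5 ≈ 9.5 km s$^{-1}$, well inside the 40 km s$^{-1}$ limit.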

This selection is effective for nearby populations, but at larger distances, the sample becomes strongly contaminated by field stars, both due to an imperfect match of real photometry to the isochrones, and due to imperfections in the extinction correction. An alternative approach was considered by McBride et al. (2021), using the neural network Sagitta. It was trained on Gaia and 2MASS photometry of stars in young populations from Kounkel et al. (2020) to autonomously identify low-mass PMS stars, automatically adjusting the color–magnitude threshold based on the age and distance of a star. The constructed sample extended out to π > 0.2 mas. It consisted of the sources with a classification probability >70%, and it also had several data quality criteria, such as $\sigma_\pi/\pi$ < 10% or $\sigma_\pi$ < 0.1 mas, precision in Gaia photometry in all bands <10%, recommended cuts based on the photometric excess noise, as well as Gaia RUWE < 1.4.

These two catalogs form the basis of YSO_PMS_APOGEE and YSO_PMS_BOSS cartons that were first introduced in V0.5 version of targeting, containing 76,332 and 73,213 stars respectively. Initially, both catalogs used Gaia DR2 data; in V1 the catalog from McBride et al. (2021) was upgraded to EDR3 version, which (due to magnitude limits) did not change substantially, except for minor improvements in sensitivity to more distant PMS stars.

2.2.8. OB

Most of the YSO cartons (with exception of Cluster and CMZ) focus exclusively on low-mass YSOs, as they are most distinct from field stars. Intermediate and massive stars, on the other hand, reach the main sequence very quickly, and thus become difficult to differentiate.

However, as OB stars have short lifetimes, they would always be young. Thus, to fill the gap in targeting, we selected sources based on examining the placement of known OB stars from Maíz Apellániz et al. (2016):

  • 1.  
    $-0.2 < (G_{\mathrm{BP}} - G_{\mathrm{RP}}) < 1.1$;
  • 2.  
    $M_G < 1.6(G_{\mathrm{BP}} - G_{\mathrm{RP}}) - 2.2$;
  • 3.  
    G < 18 mag, π > 0.3 mas.
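These three OB criteria can be sketched as a single filter. The Python below is illustrative, with hypothetical names; it assumes that $M_G$ is derived from the apparent G magnitude and the parallax as $M_G = G + 5\log_{10}(\pi/\mathrm{mas}) - 10$ without an extinction correction.

```python
import math


def in_ob_carton(g, bp, rp, parallax_mas):
    """Return True if a source passes the OB color-magnitude cuts."""
    if not (g < 18.0 and parallax_mas > 0.3):
        return False
    color = bp - rp
    # Absolute magnitude from parallax (assumption: no extinction correction).
    m_g = g + 5.0 * math.log10(parallax_mas) - 10.0
    return -0.2 < color < 1.1 and m_g < 1.6 * color - 2.2
```

For example, a G = 7 mag star at a parallax of 1 mas has $M_G$ = −3, which lies above the line $1.6 \times 0.1 - 2.2 = -2.04$ for a color of 0.1 mag and is therefore selected.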

In V0, this selection formed the basis of YSO_OB carton, which was split into YSO_OB_APOGEE and YSO_OB_BOSS, both containing 8670 stars. However, as all of the selected stars are very bright (typically G < 12 mag), they currently cannot be observed with BOSS without offsetting their positions, due to the saturation limit of the instrument.

In V0.5, these cartons were rendered obsolete, as all of the targets that are a part of YSO_OB are a perfect subset of the OBA_CORE program within SDSS-V (Zari et al. 2021); thus, they do not require duplication of effort across multiple programs.

2.3. Sample Summary

Figure 1 shows the spatial distribution of all stars in all cartons. Unsurprisingly, almost all sources are found along the Galactic plane and Gould's Belt. The angular scale height of the disk in each carton strongly depends on the typical distance of the stars within it. Of the optically bright sources, YSOs in the vicinity of the solar neighborhood dominate the PMS carton, while the sources found beyond 1 kpc dominate the OB carton (Figure 5).

Figure 5. Distribution of distances toward the stars in the optically bright cartons.

In total, the V0.5 targeting sample consists of 202,726 sources to be observed with APOGEE, and 196,188 sources to be observed with BOSS. There is some overlap between the cartons, but in general, most of the sources in each carton are unique, sensitive to distinct tracers of youth (Figure 6). For example, only 10% of stars in the Disk carton are also found in the PMS carton. This is partially due to the PMS carton requiring high-precision photometry and astrometry, which can often be poor in stars with disks, even in the optically bright stars. Furthermore, the PMS carton is primarily sensitive to nearby stars, whereas the Disk carton can include many more distant stars along the plane of the disk. Similarly, only 25% of stars in the Cluster carton are also found in the PMS carton. Clustering is unbiased with respect to $T_{\mathrm{eff}}$, and so the Cluster carton includes many more high-mass stars than can be selected as high-fidelity PMS stars based on photometry alone. On the other hand, clustering preferentially selects regions with high stellar density, and it often struggles to recover more diffuse groups, such as Taurus, the outer parts of populations with a strong density gradient, or older groups that are starting to lose dynamical coherence. As such, all of the targeting approaches are highly complementary to one another.

Figure 6. A Venn diagram showing relative sizes of all the various cartons in the ABYSS program, as well as the largest overlap in targets between them. Note that, due to a large number of cartons, overlap between some of the cartons cannot be shown.

The catalog of sources is available as a part of SDSS DR18 (Almeida et al. 2023). It can be accessed through SkyServer using an ADQL query:

    SELECT TOP 1000
      mc.carton, mt.ra, mt.dec
    FROM mos_carton mc
    JOIN mos_carton_to_target mctt ON mc.carton_pk = mctt.carton_pk
    JOIN mos_target mt ON mt.target_pk = mctt.target_pk
    WHERE CHARINDEX('yso', mc.carton) > 0

This will return the first 1000 rows; a source is listed once for every carton in which it appears.
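Because the query returns one row per carton membership, a source that belongs to several cartons is repeated. The sketch below shows one way to collapse such output to a single row per target. It is only an illustration under stated assumptions: it runs against a toy in-memory SQLite database that reuses the three table names from the query, all rows and carton names are invented, and SQLite's INSTR stands in for CHARINDEX (which SQLite does not provide). The function name is hypothetical.

```python
import sqlite3

# Toy stand-ins for the three tables used in the catalog query.
# Every row and carton name here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mos_carton (carton_pk INTEGER PRIMARY KEY, carton TEXT);
CREATE TABLE mos_target (target_pk INTEGER PRIMARY KEY, ra REAL, dec REAL);
CREATE TABLE mos_carton_to_target (carton_pk INTEGER, target_pk INTEGER);
INSERT INTO mos_carton VALUES (1, 'toy_yso_a'), (2, 'toy_yso_b'), (3, 'toy_other');
INSERT INTO mos_target VALUES (10, 83.8, -5.4), (11, 68.0, 24.5), (12, 100.0, 10.0);
INSERT INTO mos_carton_to_target VALUES (1, 10), (2, 10), (1, 11), (3, 12);
""")

def unique_yso_targets(connection):
    """Return one row per target: (target_pk, ra, dec, sorted list of cartons)."""
    rows = connection.execute("""
        SELECT mt.target_pk, mt.ra, mt.dec, GROUP_CONCAT(mc.carton, ',')
        FROM mos_carton mc
        JOIN mos_carton_to_target mctt ON mc.carton_pk = mctt.carton_pk
        JOIN mos_target mt ON mt.target_pk = mctt.target_pk
        WHERE INSTR(mc.carton, 'yso') > 0
        GROUP BY mt.target_pk, mt.ra, mt.dec
        ORDER BY mt.target_pk
    """).fetchall()
    # GROUP_CONCAT does not guarantee an order, so sort the carton names here.
    return [(pk, ra, dec, sorted(cartons.split(","))) for pk, ra, dec, cartons in rows]

targets = unique_yso_targets(conn)
```

On the real server the same grouping idea would have to be written in the SQL dialect that SkyServer accepts; alternatively, the repeated rows can be grouped after download.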

3. First Year Data

In this section, we describe the data that have been obtained in the course of the first few months of SDSS-V operations as well as the available archival data for the ABYSS targets. We give an overview of the data processing pipelines that are currently available to process these data, and their fidelity.

3.1. Planning of Observations

Although SDSS-V began its operations in 2021, it did not immediately reach its full capability. As the transition to the robotic fiber positioning system in both hemispheres required significant instrument upgrades, initial observations at APO were carried out with the SDSS-IV plug-in plate system (Wilson et al. 2019) during the first 6 months of operations. As young stars can often be found inside compact clusters, and the plug plates can pack the fibers closer together than the robot positioners, such a configuration was deemed advantageous for ABYSS. Thus, several plates were commissioned to target fields with the highest density of sources that had not been targeted in previous iterations of the survey. Additionally, YSO targets were included in the 2021 plates led by other programs.

In the previous iterations of the survey, BOSS was primarily used in dark time to look at faint targets. As such, to avoid saturation in a 15 minute exposure, the nominal bright limit for BOSS is G > 13 mag, whereas for APOGEE it is H > 7 mag. In the near future, such limits will be overcome by offsetting the fiber from the position of a star. Meanwhile, for the first-year plate operations, it was decided that ABYSS targets would be split: sources with GRP < 13 mag would be observed with APOGEE, and fainter stars would be observed with BOSS. At the moment, the split is suboptimal, because it results in a segregation in mass. However, once instrumental limitations are lifted, the survey is expected to obtain a complementary set of observations to compensate for this problem.
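The first-year split described above reduces to a single magnitude threshold. A minimal sketch, assuming the input is the Gaia GRP magnitude in mag; the function and constant names are hypothetical and are not part of any SDSS software:

```python
# Threshold quoted in the text for first-year plate operations (mag).
PLATE_SPLIT_GRP_MAG = 13.0

def assign_plate_instrument(g_rp_mag):
    """Return 'APOGEE' for sources with GRP < 13 mag and 'BOSS' for fainter ones."""
    return "APOGEE" if g_rp_mag < PLATE_SPLIT_GRP_MAG else "BOSS"
```

Since brighter young stars at a given distance tend to be more massive, a rule of this form sends the two instruments to different mass ranges, which is the segregation noted in the text.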

The SDSS-V plate program has obtained spectra for 2462 ABYSS targets with APOGEE, and 4854 targets with BOSS. These data can be supplemented by 7884 sources for which archival APOGEE spectra exist as part of DR17 (Abdurro'uf et al. 2022). These include the young stars targeted explicitly by SDSS-III and -IV, and those that have been serendipitously targeted by other programs (Figure 7). In these data, 193 sources have both APOGEE and BOSS spectra.

Figure 7. Distribution of ABYSS sources that have been observed with SDSS through 2021. Stars in yellow have been observed with APOGEE prior to the beginning of SDSS-V. Sources in blue are those that have been observed as a part of SDSS-V plate program, typically with both APOGEE and BOSS fibers assigned in a given field.

3.2. APOGEE

Due to the extensive history of observations, various pipelines have been developed to measure the stellar parameters (such as Teff and $\mathrm{log}g$) from APOGEE spectra (Cottaar et al. 2014; Olney et al. 2020; Sprague et al. 2022). The latest iteration of these pipelines, APOGEE Net, can extract parameters for all stars with Teff > 3000 K in a self-consistent manner, and for PMS stars, its $\mathrm{log}g$ is sensitive to age. Thus, using APOGEE Net, we qualitatively evaluate how prone each individual carton is to field contamination from red giants (Figure 8). A full quantitative assessment that also considers contamination from older main-sequence stars will be presented in future papers in this series.

Figure 8. Spectroscopic parameters extracted by APOGEE Net from the APOGEE spectra for new and archival ABYSS targets, highlighting the typical temperature range encompassed by each carton, as well as a typical rate of contamination. Note that most sources occupy the parameter space expected by the PMS stars, with only minor contamination from the red giants in most cartons.

Optically bright cartons (PMS, Variable, Cluster, Disk, and OB) have only minimal contamination from red giants (∼4%) relative to the total number of sources. Approximately half of the sources in the Embedded carton are red giants. The bulk of this contamination can be reduced in the future through the 2MASS color cut introduced in the V1 version of targeting (Figure 2). The contamination rate for the other two optically faint cartons, CMZ and Nebula, is difficult to evaluate at the moment due to the limited size of the samples, as existing SDSS-V observations do not cover the region of space where the bulk of these targets reside. Similarly, archival DR17 observations impose their own targeting selections that may favor particular types of sources, and so they may not be representative of the full set of sources in these cartons. For example, during SDSS-III and -IV the primary targets of the survey were stars with photometry consistent with red giants, so red giant contaminants may be overrepresented in the sample observed prior to SDSS-V.

Different cartons favor different temperature ranges. The PMS and Variable cartons typically select subsolar-mass stars: they are designed for low-mass stars that are convective and are slow to reach the main sequence. The Disk and Embedded cartons also favor low-mass stars: the Embedded carton is likely to have fewer massive stars than the Disk carton, as higher-mass stars can remain optically bright through a higher degree of extinction. Nonetheless, low-mass stars have longer disk lifetimes (e.g., Bayo et al. 2012; Ribas et al. 2015). Sources in the Cluster carton are selected not on the stellar properties of individual stars but on their membership, and so stars of all masses are represented in it. Finally, the OB carton favors sources with Teff > 10,000 K. In comparison to massive stars in the Cluster carton, the sources in the OB carton tend to skew toward hotter Teff and lower $\mathrm{log}g$ (on average by ∼0.1 dex).

Typically the lowest Teff observed in the DR17 data is lower than is currently available for SDSS-V data. As mentioned previously, this is due to preferentially observing brighter stars with APOGEE due to the saturation limit in BOSS.

3.3. BOSS

YSOs have a long history of being observed with APOGEE. On the other hand, fewer than 40 ABYSS targets have archival BOSS observations, most of them concentrated near 25 Ori (Suárez et al. 2017). Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) low-resolution spectroscopy is comparable to BOSS, both in wavelength coverage and spectral resolution. In LAMOST Data Release 7, spectra are available for 13,452 ABYSS targets, including several prominent star-forming regions. Despite this, to date only a few studies of YSOs utilized LAMOST spectra (e.g., Liu et al. 2021; Wang et al. 2022; J. Hernandez et al. 2023, in preparation).

As such, with the exclusion of RVs, currently there are no pipelines capable of extracting reliable stellar parameters from either BOSS or LAMOST spectra of low-mass YSOs. There are however existing efforts to rectify this, both by measuring reliable Teff and $\mathrm{log}g$ (e.g., L. Sizemore et al. in preparation), as well as characterizing a number of features found in the spectral range of the instrument that could be used as indicators of youth (such as Li i, or various emission lines for H or Ca; S. Saad et al. 2023, in preparation; Figure 9).

Figure 9. An example of a BOSS spectrum of a young star. The inlay shows the Li i 6707.7 Å line.

At the moment, BOSS spectra are processed by pyXCSAO (Kounkel 2022), which is a Python implementation of the IRAF RVSAO package (Tonry & Davis 1979; Kurtz & Mink 1998). The spectra are cross-correlated against synthetic PHOENIX templates (Husser et al. 2013). Subgrid solutions are derived for parameters such as Teff and $\mathrm{log}g$ using the quality of the fit. However, there are significant systematics that affect the quality of the derived parameters for young stars in particular, not dissimilar to what was observed in the original APOGEE YSO pipeline (Cottaar et al. 2014). RVs are measured from the best-fitting template for stars with Teff > 3500 K. RVs for cooler stars are determined from the best-fitting 3500 K template, as they otherwise appear to be systematically redshifted relative to the rest velocity of other stars in the same star-forming region. This has also been seen in the APOGEE data (e.g., Kounkel et al. 2019), and is due to a systematic issue in synthetic spectra of cool stars.
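The idea of measuring an RV by cross-correlating a spectrum against a template can be illustrated with a toy example. The sketch below is not pyXCSAO's API or implementation: it assumes a single template, a non-relativistic Doppler shift, and a brute-force search over a velocity grid, whereas RVSAO-style codes work in Fourier space on log-wavelength bins. All function names are hypothetical.

```python
import math

C_KMS = 299792.458  # speed of light in km/s

def interp(x, xp, fp):
    """Linear interpolation of the curve (xp, fp) at x; xp must be ascending."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    lo, hi = 0, len(xp) - 1
    while hi - lo > 1:  # binary search for the bracketing interval
        mid = (lo + hi) // 2
        if xp[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xp[lo]) / (xp[hi] - xp[lo])
    return fp[lo] + t * (fp[hi] - fp[lo])

def doppler_shift(wave, flux, v_kms):
    """Redshift a spectrum by v_kms and resample it onto the original wave grid."""
    shifted_wave = [w * (1.0 + v_kms / C_KMS) for w in wave]
    return [interp(w, shifted_wave, flux) for w in wave]

def correlation(a, b):
    """Normalized (Pearson) correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den

def measure_rv(wave, flux, template_flux, v_grid):
    """Return the trial velocity at which the shifted template best matches flux."""
    scores = [correlation(flux, doppler_shift(wave, template_flux, v)) for v in v_grid]
    best = max(range(len(v_grid)), key=scores.__getitem__)
    return v_grid[best]
```

A quick sanity check is to build a synthetic absorption line, shift it by a known velocity with `doppler_shift`, and confirm that `measure_rv` recovers that velocity when it lies on the trial grid. A mismatch between the template and the true spectrum biases the correlation peak, which is the kind of systematic described above for cool stars.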

APOGEE can achieve sub-km s−1 precision in its measured RVs. BOSS, being a lower-resolution instrument, can produce an RV precision of only ∼4–5 km s−1 for spectra of low-mass stars with high signal-to-noise ratio. Nonetheless, both instruments have been vetted to ensure consistent performance and the lack of a zero-point offset between them, both in the average properties derived for individual regions (Figure 10), and in the direct comparison of RVs of individual stars, when possible (Figure 11).
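The star-by-star comparison amounts to taking the APOGEE minus BOSS RV difference and dividing it by the BOSS uncertainty. A minimal sketch of that statistic, assuming matched lists of velocities in km s−1; the function names are hypothetical and the summary statistics chosen here (median and population standard deviation) are an illustrative choice, not necessarily those used by the authors:

```python
import statistics

def normalized_rv_differences(rv_apogee, rv_boss, sigma_boss):
    """Return (RV_APOGEE - RV_BOSS) / sigma_BOSS for each matched star."""
    return [(a - b) / s for a, b, s in zip(rv_apogee, rv_boss, sigma_boss)]

def summarize(z):
    """Median traces any zero-point offset; the spread should be near 1 if errors are realistic."""
    return statistics.median(z), statistics.pstdev(z)
```

A median close to zero indicates no zero-point offset between the instruments, and a spread close to unity indicates that the quoted BOSS uncertainties describe the actual scatter; spectroscopic binaries would show up as outliers in the wings.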

Figure 10. Velocity structure in a field observed by ABYSS toward the Cam OB1 association. Three kinematically coherent groups are found along this line of sight, at a similar location on the sky and at a similar distance. These groups are distinguishable in proper-motion space (left) and in RV space (right, same colors). The same velocity structure is recovered by both APOGEE (thick unshaded curve) and BOSS (shaded curve).

Figure 11. Difference between APOGEE and BOSS RVs for the same stars, divided by BOSS uncertainties in the RVs. All stars, regardless of the evolutionary status, observed to date are shown in red. ABYSS-only targets are shown in blue. Note that the typical scatter is consistent within 1σ; wider wings may be attributable to spectroscopic binaries.

We defer a more detailed analysis of the BOSS spectra and the parameters to subsequent publications in the series.

4. Gaia DR3 Comparison

The recent Gaia DR3 (Gaia Collaboration et al. 2022) has made available not just the astrometric parameters that have been present in Gaia DR2 or EDR3 but also RVs for 30 million stars derived with its onboard spectrograph (Katz et al. 2022), as well as stellar parameters such as Teff and $\mathrm{log}g$ for 5.5 million stars (Fouesneau et al. 2022). This census includes a number of ABYSS-targeted objects.

Unfortunately, these parameters were optimized to produce accurate solutions for typical stars. YSOs do not fall into this category, due to a number of unique spectral features they exhibit from both accretion and activity, especially near the Ca ii triplet at ∼8500 Å, which is tightly encompassed by the spectral range of Gaia's Radial Velocity Spectrometer (RVS; Recio-Blanco et al. 2022). Thus, caution must be exercised in interpreting the parameters available in this data release for young stars.

In particular, Kounkel et al. (2022a) identified two issues in Gaia RVs for YSOs: (1) they are not precise, as they have typical uncertainties of 5–10 km s−1, which is significantly worse than the instrumental limit at the typical signal-to-noise ratio of these observations; and (2) they are not accurate, as when evaluated against high-resolution RVs from existing APOGEE observations of spectroscopically stable young stars, they typically show a scatter >5σ (Figure 12). This issue persists for more than 100 Myr, as a large scatter in RVs among low-mass stars is detected in a (comparatively) older cluster such as the Pleiades.

Figure 12. A comparison of RVs for young stars between APOGEE and Gaia DR3. Sources have been selected from Kounkel et al. (2019) for which at least three APOGEE epochs have been obtained to confirm their RV stability, excluding any of the spectroscopic binaries. The typical precision in RV is <1 km s−1 for APOGEE, and ∼6 km s−1 for Gaia.

Similarly, we evaluate Teff and $\mathrm{log}g$ of the ABYSS stars that Gaia has observed, and compare them against the derived parameters from APOGEE (Figure 13). We find that the parameter space occupied by cool PMS stars (Teff < 5000 K, low $\mathrm{log}g$) is currently not well sampled by the pipelines employed by the Gaia consortium. The sources that occupy this parameter space are often recognized to have lower $\mathrm{log}g$ than what is commonly found in main-sequence stars, but they are pushed toward hotter Teff, which results in placing them toward the red giant branch. No correlation is found between the reported $\mathrm{log}g$ values for low-mass stars between these two data sets, however. Further, the Gaia-derived $\mathrm{log}g$ are not sensitive to stellar ages, unlike those from APOGEE (e.g., Olney et al. 2020; Kounkel et al. 2022c). Similar caution should be given to other derived parameters produced by Gaia for these stars, such as, for instance, ages.

Figure 13. A comparison of APOGEE-derived Teff and $\mathrm{log}g$ for the ABYSS objects, vs. those in Gaia DR3.

We stress that this issue is specific to young stars, most prominently on the low-mass end, and should not affect more evolved main-sequence stars that are older than a few hundred Myr. In future releases, the Gaia spectra may be reprocessed with a pipeline that is better tuned to YSOs, either within the collaboration or through efforts using the publicly released spectra. However, Gaia DR4 is not expected until 2026 at the earliest, by which point ABYSS is expected to approach completion. As such, ABYSS will be the first comprehensive spectroscopic census of young stars across the entire sky.

5. Summary

We present the selection strategy of young stars across the entire sky targeted by SDSS-V. The target catalogs are released publicly as a part of data release 18. In total, this selection has resulted in a sample of ∼200,000 sources, down to the limiting magnitude of H < 13 mag or GRP < 15.5 mag, encompassing both diffuse associations as well as compact massive complexes of young stars across a range of distances, from the solar neighborhood to the inner Galaxy. In future years, either optical or near-infrared spectra (or both) are expected to be available for most of these sources.

The selection strategy for ABYSS is complex, and relies on a variety of different tracers of youth, in an attempt to create as complete, as homogeneous, and as clean a sample as possible, across all masses and ages younger than ∼30 Myr. A preliminary examination of the data shows that the vast majority of the sources observed to date exhibit spectroscopic signatures of youth, although some fraction of contamination from more evolved sources (main sequence or red giants) is present across all cartons. Future studies employing larger data sets will precisely quantify the contamination level and allow cleaner samples to be selected. The development of dedicated pipelines to derive accurate stellar parameters is underway.

The previous iterations of SDSS have produced a spectroscopic census of young stars across several selected star-forming regions. These data have been instrumental in understanding the star formation history and the three-dimensional kinematics of star-forming regions as a whole (e.g., Foster et al. 2015; Da Rio et al. 2016; Stutz & Gould 2016; Galli et al. 2019; Kounkel et al. 2022c), as well as properties of individual stars, such as accretion (Campbell et al. 2023), multiplicity (Kounkel et al. 2019), evolutionary properties (e.g., Serna et al. 2021; Cao et al. 2022), and stellar parameters (e.g., Roman-Lopes et al. 2019; Ramírez-Preciado et al. 2020). Like SDSS, other spectroscopic surveys have targeted nearby star-forming regions, such as GALAH (Kos et al. 2021) and Gaia-ESO (e.g., Sacco et al. 2015; Bouvier et al. 2016). Taken together, however, these efforts amounted to only ∼10,000 young stars across a very limited footprint on the sky that generally could not encompass even a given extended population in full. ABYSS will expand the total spectroscopic census of young stars by more than an order of magnitude, without the stringent spatial restrictions that were necessary in the past, and it will substantially increase our ability to probe the recent epoch of star formation in the Galaxy as a whole. Finally, while sophisticated, the homogeneous selection function described in this study is a significant improvement on previous efforts that were specifically tailored to individual star-forming regions (Román-Zúñiga et al. 2023), as it enables a more direct comparison between them.

In the next few years, this survey will be complemented by the 4MOST Survey of Young Stars (4SYS), and we expect future Gaia data releases to improve the processing of the RVS spectra of young stars. However, ABYSS is the first major spectroscopic survey to focus on young stars across the sky and to provide accurate stellar parameters for large, statistical samples. With yearly data releases (DR19 onward) yielding spectra and stellar parameters, it is our hope that these data will be of value to the community.

A.S. gratefully acknowledges support by the Fondecyt Regular (project code 1220610), and ANID BASAL projects ACE210002 and FB210003. C.R.-Z. acknowledges support from projects CONACYT CB2018 A1S-9754, Mexico; and UNAM DGAPA PAPIIT IN112620, Mexico. The research leading to these results has (partially) received funding from the KU Leuven Research Council (grant C16/18/005: PARADISE) and from the BELgian federal Science Policy Office (BELSPO) through PRODEX grant PLATO. K.P.R. acknowledges support from ANID FONDECYT Iniciación 11201161. A.B. acknowledges partial funding by the Deutsche Forschungsgemeinschaft Excellence Strategy—EXC 2094-390783311 and the ANID BASAL project FB210003. R.L.V. acknowledges support from CONACYT through a postdoctoral fellowship within the program "Estancias Posdoctorales por México." B.R.-A. acknowledges funding support from FONDECYT Iniciación grant 11181295 and ANID Basal project FB210003.

Funding for the Sloan Digital Sky Survey V has been provided by the Alfred P. Sloan Foundation, the Heising–Simons Foundation, the National Science Foundation, and the Participating Institutions. SDSS acknowledges support and resources from the Center for High-Performance Computing at the University of Utah. The SDSS website is www.sdss5.org.

SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS Collaboration, including the Carnegie Institution for Science, Chilean National Time Allocation Committee (CNTAC) ratified researchers, the Gotham Participation Group, Harvard University, Heidelberg University, The Johns Hopkins University, L'Ecole polytechnique fédérale de Lausanne (EPFL), Leibniz-Institut für Astrophysik Potsdam (AIP), Max-Planck-Institut für Astronomie (MPIA Heidelberg), Max-Planck-Institut für Extraterrestrische Physik (MPE), Nanjing University, National Astronomical Observatories of China (NAOC), New Mexico State University, The Ohio State University, Pennsylvania State University, Smithsonian Astrophysical Observatory, Space Telescope Science Institute (STScI), the Stellar Astrophysics Participation Group, Universidad Nacional Autónoma de México, University of Arizona, University of Colorado Boulder, University of Illinois at Urbana-Champaign, University of Toronto, University of Utah, University of Virginia, Yale University, and Yunnan University.

This work has made use of data from the European Space Agency (ESA) mission Gaia, processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

Software: TOPCAT (Taylor 2005), PyXCSAO, APOGEE Net.
