DOC: Deprecate pandas.SparseArray

pandas-dev · jreback · Jan 5, 2020 · Jan 3, 2020 · Jan 3, 2020 · Jan 3, 2020
commit 23696b248c22aa044b566bce37ddcd9878f0f9a6
diff --git a/doc/source/development/contributing_docstring.rst b/doc/source/development/contributing_docstring.rst
@@ -399,7 +399,7 @@ DataFrame:
 * DataFrame
 * pandas.Index
 * pandas.Categorical
-* pandas.SparseArray
+* pandas.arrays.SparseArray
 
 If the exact type is not relevant, but must be compatible with a numpy
 array, array-like can be specified. If Any type that can be iterated is

diff --git a/doc/source/getting_started/basics.rst b/doc/source/getting_started/basics.rst
@@ -1951,7 +1951,7 @@ documentation sections for more on each type.
 | period            | :class:`PeriodDtype`      | :class:`Period`    | :class:`arrays.PeriodArray`   | ``'period[<freq>]'``,                   | :ref:`timeseries.periods`     |
 | (time spans)      |                           |                    |                               | ``'Period[<freq>]'``                    |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
-| sparse            | :class:`SparseDtype`      | (none)             | :class:`SparseArray`          | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
+| sparse            | :class:`SparseDtype`      | (none)             | :class:`arrays.SparseArray`   | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
 |                   |                           |                    |                               | ``'Sparse[float]'``                     |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
 | intervals         | :class:`IntervalDtype`    | :class:`Interval`  | :class:`arrays.IntervalArray` | ``'interval'``, ``'Interval'``,         | :ref:`advanced.intervalindex` |

diff --git a/doc/source/getting_started/dsintro.rst b/doc/source/getting_started/dsintro.rst
@@ -741,7 +741,7 @@ implementation takes precedence and a Serie
8000
s is returned.
    np.maximum(ser, idx)
 
 NumPy ufuncs are safe to apply to :class:`Series` backed by non-ndarray arrays,
-for example :class:`SparseArray` (see :ref:`sparse.calculation`). If possible,
+for example :class:`arrays.SparseArray` (see :ref:`sparse.calculation`). If possible,
 the ufunc is applied without converting the underlying data to an ndarray.
 
 Console display

diff --git a/doc/source/reference/arrays.rst b/doc/source/reference/arrays.rst
@@ -444,13 +444,13 @@ Sparse data
 -----------
 
 Data where a single value is repeated many times (e.g. ``0`` or ``NaN``) may
-be stored efficiently as a :class:`SparseArray`.
+be stored efficiently as a :class:`arrays.SparseArray`.
 
 .. autosummary::
    :toctree: api/
    :template: autosummary/class_without_autosummary.rst
 
-   SparseArray
+   arrays.SparseArray
 
 .. autosummary::
    :toctree: api/

diff --git a/doc/source/user_guide/sparse.rst b/doc/source/user_guide/sparse.rst
@@ -15,7 +15,7 @@ can be chosen, including 0) is omitted. The compressed values are not actually s
 
    arr = np.random.randn(10)
    arr[2:-2] = np.nan
-   ts = pd.Series(pd.SparseArray(arr))
+   ts = pd.Series(pd.arrays.SparseArray(arr))
    ts
 
 Notice the dtype, ``Sparse[float64, nan]``. The ``nan`` means that elements in the
@@ -51,7 +51,7 @@ identical to their dense counterparts.
 SparseArray
 -----------
 
-:class:`SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
+:class:`arrays.SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
 for storing an array of sparse values (see :ref:`basics.dtypes` for more
 on extension arrays). It is a 1-dimensional ndarray-like object storing
 only values distinct from the ``fill_value``:
@@ -61,7 +61,7 @@ only values distinct from the ``fill_value``:
    arr = np.random.randn(10)
    arr[2:5] = np.nan
    arr[7:8] = np.nan
-   sparr = pd.SparseArray(arr)
+   sparr = pd.arrays.SparseArray(arr)
    sparr
 
 A sparse array can be converted to a regular (dense) ndarray with :meth:`numpy.asarray`
@@ -144,7 +144,7 @@ to ``SparseArray`` and get a ``SparseArray`` as a result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., np.nan, np.nan, -2., np.nan])
+   arr = pd.arrays.SparseArray([1., np.nan, np.nan, -2., np.nan])
    np.abs(arr)
 
 
@@ -153,7 +153,7 @@ the correct dense result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
+   arr = pd.arrays.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
    np.abs(arr)
    np.abs(arr).to_dense()
 
@@ -194,7 +194,7 @@ From an array-like, use the regular :class:`Series` or
 .. ipython:: python
 
    # New way
-   pd.DataFrame({"
67E6
A": pd.SparseArray([0, 1])})
+   pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
 
 From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,
 
@@ -256,10 +256,10 @@ Instead, you'll need to ensure that the values being assigned are sparse
 
 .. ipython:: python
 
-   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
    df['B'] = [0, 0]  # remains dense
    df['B'].dtype
-   df['B'] = pd.SparseArray([0, 0])
+   df['B'] = pd.arrays.SparseArray([0, 0])
    df['B'].dtype
 
 The ``SparseDataFrame.default_kind`` and ``SparseDataFrame.default_fill_value`` attributes

diff --git a/doc/source/whatsnew/v1.0.0.rst b/doc/source/whatsnew/v1.0.0.rst
@@ -577,6 +577,7 @@ Deprecations
   it is recommended to use ``json_normalize`` as :func:`pandas.json_normalize` instead (:issue:`27586`).
 - :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_feather`, and :meth:`DataFrame.to_parquet` argument "fname" is deprecated, use "path" instead (:issue:`23574`)
 - The deprecated internal attributes ``_start``, ``_stop`` and ``_step`` of :class:`RangeIndex` now raise a ``FutureWarning`` instead of a ``DeprecationWarning`` (:issue:`26581`)
+- ``pandas.SparseArray`` has been deprecated.  Use ``pandas.arrays.SparseArray`` (:class:`arrays.SparseArray`) instead. (:issue:`30642`)
 
 **Selecting Columns from a Grouped DataFrame**