| title | microsoftml Python package |
|---|---|
| description | Introduces the Microsoft machine learning algorithms and models for Python, as related to SQL Server machine learning workloads. |
| ms.prod | sql |
| ms.technology | machine-learning |
| ms.date | 11/06/2019 |
| ms.topic | conceptual |
| author | dphansen |
| ms.author | davidph |
| monikerRange | >=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions |
microsoftml is a Python 3.5-compatible module from Microsoft that provides high-performance machine learning algorithms. It includes functions for training and transformations, scoring, text and image analysis, and feature extraction for deriving values from existing data.
The machine learning APIs were developed by Microsoft for internal machine learning applications and have been refined over the years to support high performance on big data, using multicore processing and fast data streaming. This package originated as a Python equivalent of an R version, MicrosoftML, that has similar functions.
The microsoftml library is distributed in multiple Microsoft products, but usage is the same whether you get the library in SQL Server or another product. Because the functions are the same, documentation for individual microsoftml functions is published to just one location under the Python reference for Microsoft Machine Learning Server. Should any product-specific behaviors exist, discrepancies will be noted in the function help page.
The microsoftml module is based on Python 3.5 and available only when you install one of the following Microsoft products or downloads:
- SQL Server Machine Learning Services
- Microsoft Machine Learning Server 9.2.0 or later
- Python client libraries for a data science client
Note
Full product release versions are Windows-only in SQL Server 2017. Both Windows and Linux are supported for microsoftml in SQL Server 2019.
Algorithms in microsoftml depend on revoscalepy for:
- Data source objects. The data consumed by microsoftml functions is created using revoscalepy functions.
- Remote computing (shifting function execution to a remote SQL Server instance). The revoscalepy library provides functions for creating and activating a remote compute context for SQL Server.
In most cases, you will load the packages together whenever you are using microsoftml.
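Because microsoftml depends on revoscalepy, it can be useful to confirm that both packages are visible to the current interpreter before loading them together. The following is a minimal sketch that uses only the standard library; the helper name `find_missing_packages` is hypothetical and not part of either package.

```python
import importlib.util

# Packages that are normally loaded together when using microsoftml.
REQUIRED_PACKAGES = ("revoscalepy", "microsoftml")


def find_missing_packages(names=REQUIRED_PACKAGES):
    """Return the package names that cannot be located by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing_packages()
if missing:
    # Expected on any Python installation that did not come from one of the
    # Microsoft products listed above.
    print("Not installed in this interpreter:", ", ".join(missing))
else:
    import revoscalepy
    import microsoftml
    print("revoscalepy and microsoftml are available")
```

Running the check first lets a script report a missing installation clearly instead of stopping with an import error partway through.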
This section lists the functions by category to give you an idea of how each one is used. You can also use the table of contents to find functions in alphabetical order.
| Function | Description |
|---|---|
| microsoftml.rx_ensemble | Train an ensemble of models. |
| microsoftml.rx_fast_forest | Random Forest. |
| microsoftml.rx_fast_linear | Linear model with Stochastic Dual Coordinate Ascent. |
| microsoftml.rx_fast_trees | Boosted Trees. |
| microsoftml.rx_logistic_regression | Logistic Regression. |
| microsoftml.rx_neural_network | Neural Network. |
| microsoftml.rx_oneclass_svm | Anomaly Detection. |
| Function | Description |
|---|---|
| microsoftml.categorical | Converts a text column into categories. |
| microsoftml.categorical_hash | Hashes and converts a text column into categories. |
| Function | Description |
|---|---|
| microsoftml.concat | Concatenates multiple columns into a single vector. |
| microsoftml.drop_columns | Drops columns from a dataset. |
| microsoftml.select_columns | Retains columns of a dataset. |
| Function | Description |
|---|---|
| microsoftml.count_select | Feature selection based on counts. |
| microsoftml.mutualinformation_select | Feature selection based on mutual information. |
| Function | Description |
|---|---|
| microsoftml.featurize_text | Converts text columns into numerical features. |
| microsoftml.get_sentiment | Sentiment analysis. |
| Function | Description |
|---|---|
| microsoftml.load_image | Loads an image. |
| microsoftml.resize_image | Resizes an image. |
| microsoftml.extract_pixels | Extracts pixels from an image. |
| microsoftml.featurize_image | Converts an image into features. |
| Function | Description |
|---|---|
| microsoftml.rx_featurize | Data transformation for data sources. |
| Function | Description |
|---|---|
| microsoftml.rx_predict | Scores using a Microsoft machine learning model. |
Functions in microsoftml are callable in Python code encapsulated in stored procedures. Most developers build microsoftml solutions locally, and then migrate finished Python code to stored procedures as a deployment exercise.
The microsoftml package for Python is installed by default, but unlike revoscalepy, it is not loaded by default when you start a Python session using the Python executables installed with SQL Server.
As a first step, import the microsoftml package, and import revoscalepy if you need to use remote compute contexts or related connectivity or data source objects. Then, reference the individual functions you need.
```python
# Training function from microsoftml
from microsoftml.modules.logistic_regression.rx_logistic_regression import rx_logistic_regression
# Summary and data import functions from revoscalepy
from revoscalepy.functions.RxSummary import rx_summary
from revoscalepy.etl.RxImport import rx_import_datasource
```
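Once the functions are imported, a typical local workflow trains a model and then scores data with it. The sketch below illustrates that pattern with `rx_logistic_regression` and `rx_predict` on a small in-memory pandas data frame. The column names (`label`, `x1`, `x2`), the sample values, and the wrapper function `train_and_score` are made up for illustration, and the example assumes an environment where microsoftml is installed; elsewhere it returns `None` without training.

```python
def train_and_score():
    """Train a binary logistic regression model and score the same rows.

    Returns None when pandas or microsoftml is not installed.
    """
    try:
        import pandas
        from microsoftml import rx_logistic_regression, rx_predict
    except ImportError:
        return None

    # Hypothetical toy data: a boolean label and two numeric features.
    data = pandas.DataFrame({
        "label": [False, False, True, False, True, True, False, True],
        "x1": [0.1, 0.4, 0.5, 0.6, 0.7, 0.9, 0.3, 0.8],
        "x2": [1.0, 0.8, 0.9, 0.3, 0.4, 0.1, 0.6, 0.2],
    })

    # Train on the data frame; the formula names the label and the features.
    model = rx_logistic_regression(
        formula="label ~ x1 + x2", data=data, method="binary")

    # Score the same rows with the trained model.
    return rx_predict(model, data=data)


scores = train_and_score()
if scores is None:
    print("microsoftml is not available in this interpreter")
else:
    print(scores.head())
```

The same calls accept revoscalepy data source objects in place of the data frame, which is how the code would usually be adapted when it moves into a stored procedure.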