add phone calls sample (#863) · dotnet/machinelearning-samples@e521c53 · GitHub
Commit e521c53

guinaoyuyi@microsoft.com and yuyi@microsoft.com authored

add phone calls sample (#863)

Co-authored-by: yuyi@microsoft.com <Yuanxiang.Ying@microsoft.com>

1 parent 2ff213e, commit e521c53

9 files changed: +394 -0 lines changed

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
# Anomaly Detection of Numbers of Phone Calls

| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms |
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| v5.2 | Dynamic API | Up-to-date | Console app | .csv files | Call Numbers Anomaly Detection | Time Series - Anomaly Detection | Sr Entire Anomaly Detection, Period Detection |

In this introductory sample, you'll see how to use [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) to detect **anomalies** in a time series of daily phone call counts. In the world of machine learning, this type of task is called time-series anomaly detection.

## Problem
We have data on the number of calls over about 11 weeks (78 daily values). The data has a periodic pattern: call volume is high on weekdays and low on weekends. We want to find the points that fall outside the regular pattern of the series.

To solve this problem, we will build an ML model that takes as inputs:
* Date
* Number of calls.

and outputs the anomalies in the number of calls.

## Dataset
We have created a sample dataset of daily call counts. The dataset `phone_calls.csv` can be found [here](./SrCnnEntireDetection/Data/phone_calls.csv).

The **Phone Calls dataset** has the following format:

| timestamp | value |
|--------|--------------|
| 2018/9/3 | 36.69670857 |
| 2018/9/4 | 35.74160571 |
| ... | ... |
| 2018/10/3 | 34.49893429 |
| ... | ... |

![Time-Series data](docs/images/data_visualization.png)

The data in the Phone Calls dataset was collected from real-world transactions and then normalized and rescaled.

## ML task - Time Series Anomaly Detection
Anomaly detection is the process of detecting outliers in the data. In a time series, it means detecting the timestamps, or points on a given input series, at which the series behaves differently from what was expected. These deviations typically indicate events of interest in the problem domain: a cyber-attack on user accounts, a power outage, a burst in requests per second (RPS) on a server, a memory leak, etc.

## Solution
To solve this problem, we first determine the period of the series. We then extract the periodic component of the series and apply anomaly detection to the residual part. In ML.NET, the `DetectSeasonality` function finds the period of a given series. Given the period, the STL algorithm decomposes the time series into three components as `Y = T + S + R`, where `Y` is the original series, `T` is the trend component, `S` is the seasonal component and `R` is the residual component (refer to [this](http://www.nniiem.ru/file/news/2016/stl-statistical-model.pdf) paper for more details on this algorithm). The SR-CNN detector is then applied to `R` to capture the anomalies (refer to [this](https://arxiv.org/pdf/1906.03821.pdf) paper for more details on this algorithm).

![Detect-Anomaly-Pipeline](docs/images/detect-anomaly-pipeline.png)
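
To make the `Y = T + S + R` idea concrete, here is a minimal sketch of a much simpler decomposition than STL. It assumes a constant trend (the overall mean) and estimates the seasonal component as the per-phase mean, so it is only an illustration of what "residual" means, not the algorithm ML.NET runs. The class and method names (`DecompositionSketch`, `Residuals`) are hypothetical.

```csharp
using System;
using System.Linq;

public static class DecompositionSketch
{
    // Simplified stand-in for STL (assumption: the trend is a constant level).
    // The seasonal offset for each phase is the mean of all values that share
    // that position within the period (for example, all Mondays when period = 7).
    public static double[] Residuals(double[] y, int period)
    {
        if (period <= 0 || y.Length < period)
            throw new ArgumentException("Series must cover at least one full period.");

        double trend = y.Average();            // T: constant level
        var seasonal = new double[period];     // S: one offset per phase
        for (int phase = 0; phase < period; phase++)
        {
            seasonal[phase] = y.Where((_, i) => i % period == phase).Average() - trend;
        }

        // R = Y - T - S
        return y.Select((v, i) => v - trend - seasonal[i % period]).ToArray();
    }
}
```

With a perfectly repeating weekly pattern, every residual is close to zero; a single day that deviates from its usual weekday level stands out as a large residual, which is the signal an anomaly detector then works on.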

Luckily, ML.NET makes the process simple, as this sample shows.

### 1. Detect Period

In the first step, we invoke the `DetectSeasonality` function to obtain the period.

```CSharp
int period = mlContext.AnomalyDetection.DetectSeasonality(dataView, inputColumnName);
```
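
For intuition about what "detecting the period" means, the following sketch picks the lag with the highest sample autocorrelation. This is a deliberately naive technique chosen for illustration; it is not a description of how `DetectSeasonality` is implemented, and the names `SeasonalitySketch` and `EstimatePeriod` are hypothetical.

```csharp
using System;
using System.Linq;

public static class SeasonalitySketch
{
    // Returns the lag in [2, n/2] with the highest sample autocorrelation,
    // or -1 when the series is constant (a convention chosen for this sketch).
    public static int EstimatePeriod(double[] x)
    {
        int n = x.Length;
        if (n < 4) throw new ArgumentException("Need at least 4 points.", nameof(x));

        double mean = x.Average();
        double variance = x.Sum(v => (v - mean) * (v - mean));
        if (variance == 0) return -1;

        int bestLag = -1;
        double bestScore = double.NegativeInfinity;
        for (int lag = 2; lag <= n / 2; lag++)
        {
            // Correlate the series with a copy of itself shifted by `lag`.
            double score = 0;
            for (int t = 0; t + lag < n; t++)
            {
                score += (x[t] - mean) * (x[t + lag] - mean);
            }
            score /= variance;

            if (score > bestScore)
            {
                bestScore = score;
                bestLag = lag;
            }
        }
        return bestLag;
    }
}
```

Applied to the first four weeks of the phone calls data (28 values that repeat every 7 days), `EstimatePeriod` returns 7, matching the weekday/weekend pattern described in the Problem section.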

### 2. Detect Anomaly

First, we specify the parameters used by the SrCnnEntire detector (refer to [here](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.timeseriescatalog.detectentireanomalybysrcnn?view=ml-dotnet#Microsoft_ML_TimeSeriesCatalog_DetectEntireAnomalyBySrCnn_Microsoft_ML_AnomalyDetectionCatalog_Microsoft_ML_IDataView_System_String_System_String_System_Double_System_Int32_System_Double_Microsoft_ML_TimeSeries_SrCnnDetectMode_) for details on the parameters). Then, we invoke the detector and obtain a view of the output data.
```CSharp
var options = new SrCnnEntireAnomalyDetectorOptions()
{
    Threshold = 0.3,
    Sensitivity = 64.0,
    DetectMode = SrCnnDetectMode.AnomalyAndMargin,
    Period = period,
};
var outputDataView = mlContext.AnomalyDetection.DetectEntireAnomalyBySrCnn(dataView, outputColumnName, inputColumnName, options);
```

### 3. Consume results
The results can be retrieved by simply enumerating the output. `Anomaly`, `ExpectedValue`, `UpperBoundary` and `LowerBoundary` are some of the useful output columns.

```CSharp
//STEP 5: Get the detection results as an IEnumerable
var predictions = mlContext.Data.CreateEnumerable<PhoneCallsPrediction>(
    outputDataView, reuseRowObject: false);

Console.WriteLine("The anomaly detection results obtained.");
var index = 0;

// The header matches the eight comma-separated values printed per row
// (the row index followed by the seven elements of the Prediction vector).
Console.WriteLine("Index,Anomaly,AnomalyScore,Mag,ExpectedValue,BoundaryUnit,UpperBoundary,LowerBoundary");
foreach (var p in predictions)
{
    if (p.Prediction[0] == 1)
    {
        Console.WriteLine("{0},{1},{2},{3},{4},{5},{6},{7} <-- alert is on, detected anomaly", index,
            p.Prediction[0], p.Prediction[1], p.Prediction[2], p.Prediction[3], p.Prediction[4], p.Prediction[5], p.Prediction[6]);
    }
    else
    {
        Console.WriteLine("{0},{1},{2},{3},{4},{5},{6},{7}", index,
            p.Prediction[0], p.Prediction[1], p.Prediction[2], p.Prediction[3], p.Prediction[4], p.Prediction[5], p.Prediction[6]);
    }
    ++index;
}

//Index,Anomaly,AnomalyScore,Mag,ExpectedValue,BoundaryUnit,UpperBoundary,LowerBoundary
//0,0,0,0.012431224740909462,36.841787256739266,32.92296779138513,41.14206982401966,32.541504689458876
//1,0,0,0.06732467206114204,35.67303618137362,32.92296779138513,39.97331874865401,31.372753614093227
//2,0,0,0.053027383620274836,34.710132999891826,33.06901172138514,39.029491313022824,30.390774686760828
//3,0,0,0.027326808903921952,33.44765248883495,33.215055651385136,37.786086547816545,29.10921842985335
//4,0,0,0.0074169435448767015,28.937110922276364,33.06901172138514,33.25646923540736,24.61775260914537
//5,0,0,0.01068288760963436,5.143895892785781,32.92296779138513,9.444178460066171,0.843613325505391
//6,0,0,0.02901575691006479,5.163325228419392,32.92296779138513,9.463607795699783,0.8630426611390014
//7,0,0,0.015220262187074987,36.76414836240396,32.92296779138513,41.06443092968435,32.46386579512357
//8,0,0,0.029223955855920452,35.77908590657007,32.92296779138513,40.07936847385046,31.478803339289676
//9,0,0,0.05014588266429284,34.547259536635245,32.92296779138513,38.847542103915636,30.246976969354854
//10,0,0,0.006478629327524482,33.55193524820608,33.06901172138514,37.871293561337076,29.23257693507508
//11,0,0,0.0144699438892775,29.091800129624648,32.92296779138513,33.392082696905035,24.79151756234426
//12,0,0,0.00941397738418861,5.154836630338823,32.92296779138513,9.455119197619213,0.8545540630584334
//13,0,0,0.01012680059746895,5.234332502492464,32.92296779138513,9.534615069772855,0.934049935212073
//14,0,0,0.0391359937506989,36.54992549471526,32.92296779138513,40.85020806199565,32.24964292743487
//15,0,0,0.01879091709088552,35.79526470980883,32.92296779138513,40.095547277089224,31.494982142528443
//16,0,0,0.04275209137629126,34.34099013096804,32.92296779138513,38.64127269824843,30.040707563687647
//17,0,0,0.024479312458949517,33.61201516582131,32.92296779138513,37.9122977331017,29.31173259854092
//18,0,0,0.010781906482188448,29.223563320561812,32.92296779138513,33.5238458878422,24.923280753281425
//19,0,0,0.006907498717766534,5.170512168851533,32.92296779138513,9.470794736131923,0.8702296015711433
//20,0,0,0.003183991678813579,5.2614938889462834,32.92296779138513,9.561776456226674,0.9612113216658926
//21,0,0,0.04256581040333137,36.37103858487317,32.92296779138513,40.67132115215356,32.07075601759278
//22,0,0,0.022860533704528126,35.813544599026855,32.92296779138513,40.113827166307246,31.513262031746464
//23,0,0,0.019266922707912835,34.05600492733225,32.92296779138513,38.356287494612644,29.755722360051863
//24,0,0,0.008008656062259012,33.65828319077884,32.92296779138513,37.95856575805923,29.358000623498448
//25,0,0,0.018746201354033914,29.381125690882463,32.92296779138513,33.681408258162854,25.080843123602072
//26,0,0,0.0141022037992637,5.261543539820418,32.92296779138513,9.561826107100808,0.9612609725400283
//27,0,0,0.013396001938040617,5.4873712582971805,32.92296779138513,9.787653825577571,1.1870886910167897
//28,1,0.4971326063712256,0.3521692757832201,36.504694001629254,32.92296779138513,40.804976568909645,32.20441143434886 <-- alert is on, detected anomaly
```
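
Indexing into `Prediction[0]` through `Prediction[6]` is easy to get wrong, so one option is to wrap the vector in a small type with named properties. The sketch below assumes the seven-element layout produced in `AnomalyAndMargin` mode, in the order listed in the sample's `PhoneCallsPrediction` comment (isAnomaly, anomalyScore, magnitude, expectedValue, boundaryUnits, upperBoundary, lowerBoundary). `AnomalyResult` and `FromVector` are hypothetical names, not part of ML.NET.

```csharp
using System;

public sealed class AnomalyResult
{
    public bool IsAnomaly { get; }
    public double AnomalyScore { get; }
    public double Magnitude { get; }
    public double ExpectedValue { get; }
    public double BoundaryUnit { get; }
    public double UpperBoundary { get; }
    public double LowerBoundary { get; }

    private AnomalyResult(double[] v)
    {
        IsAnomaly = v[0] == 1;   // the detector writes 1 for an anomaly, 0 otherwise
        AnomalyScore = v[1];
        Magnitude = v[2];
        ExpectedValue = v[3];
        BoundaryUnit = v[4];
        UpperBoundary = v[5];
        LowerBoundary = v[6];
    }

    // Assumes the 7-element layout described above; other detect modes
    // may produce vectors of a different length, which this sketch rejects.
    public static AnomalyResult FromVector(double[] v)
    {
        if (v == null || v.Length != 7)
            throw new ArgumentException("Expected a 7-element prediction vector.", nameof(v));
        return new AnomalyResult(v);
    }
}
```

Inside the `foreach` loop above, `AnomalyResult.FromVector(p.Prediction)` would then give access to `IsAnomaly`, `ExpectedValue`, `UpperBoundary` and `LowerBoundary` by name instead of by position.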
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16
VisualStudioVersion = 16.0.30704.19
MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "SrCnnEntireDetection", "SrEntireDetection\SrEntireDetectionConsoleApp\SrCnnEntireDetection.csproj", "{C22D8D2F-D1CE-466D-BECE-46429F7A9903}"
EndProject
Global
    GlobalSection(SolutionConfigurationPlatforms) = preSolution
        Debug|Any CPU = Debug|Any CPU
        Release|Any CPU = Release|Any CPU
    EndGlobalSection
    GlobalSection(ProjectConfigurationPlatforms) = postSolution
        {C22D8D2F-D1CE-466D-BECE-46429F7A9903}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
        {C22D8D2F-D1CE-466D-BECE-46429F7A9903}.Debug|Any CPU.Build.0 = Debug|Any CPU
        {C22D8D2F-D1CE-466D-BECE-46429F7A9903}.Release|Any CPU.ActiveCfg = Release|Any CPU
        {C22D8D2F-D1CE-466D-BECE-46429F7A9903}.Release|Any CPU.Build.0 = Release|Any CPU
    EndGlobalSection
    GlobalSection(SolutionProperties) = preSolution
        HideSolutionNode = FALSE
    EndGlobalSection
    GlobalSection(ExtensibilityGlobals) = postSolution
        SolutionGuid = {04829BD9-3AFA-47D0-8B99-28B41845F774}
    EndGlobalSection
EndGlobal
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
timestamp,value
2018/9/3,36.69670857
2018/9/4,35.74160571
2018/9/5,34.11781143
2018/9/6,33.53363571
2018/9/7,29.19957714
2018/9/8,5.18315
2018/9/9,5.324905714
2018/9/10,36.69670857
2018/9/11,35.74160571
2018/9/12,34.11781143
2018/9/13,33.53363571
2018/9/14,29.19957714
2018/9/15,5.18315
2018/9/16,5.324905714
2018/9/17,36.69670857
2018/9/18,35.74160571
2018/9/19,34.11781143
2018/9/20,33.53363571
2018/9/21,29.19957714
2018/9/22,5.18315
2018/9/23,5.324905714
2018/9/24,36.69670857
2018/9/25,35.74160571
2018/9/26,34.11781143
2018/9/27,33.53363571
2018/9/28,29.19957714
2018/9/29,5.18315
2018/9/30,5.324905714
2018/10/1,31.34386429
2018/10/2,36.18100429
2018/10/3,34.49893429
2018/10/4,34.18594143
2018/10/5,29.96867143
2018/10/6,5.240491429
2018/10/7,5.304298571
2018/10/8,37.94839429
2018/10/9,36.3811
2018/10/10,35.76107429
2018/10/11,35.28894143
2018/10/12,31.08465286
2018/10/13,5.931802857
2018/10/14,5.476382857
2018/10/15,36.23001857
2018/10/16,35.51714
2018/10/17,21.95187143
2018/10/18,31.15111714
2018/10/19,28.21996429
2018/10/20,4.532814286
2018/10/21,5.376088571
2018/10/22,36.19748286
2018/10/23,36.05797571
2018/10/24,35.05092286
2018/10/25,34.78529
2018/10/26,31.19532857
2018/10/27,5.655607143
2018/10/28,5.925987143
2018/10/29,26.61867857
2018/10/30,35.35089571
2018/10/31,34.09542571
2018/11/1,28.74181
2018/11/2,28.20479429
2018/11/3,4.849405714
2018/11/4,5.444168571
2018/11/5,33.21586857
2018/11/6,35.69544714
2018/11/7,34.79379714
2018/11/8,34.26969714
2018/11/9,30.07392
2018/11/10,3.18219
2018/11/11,3.964938571
2018/11/12,45.21586857
2018/11/13,35.69544714
2018/11/14,34.79379714
2018/11/15,34.26969714
2018/11/16,30.07392
2018/11/17,5.266295714
2018/11/18,5.386695714
2018/11/19,33.80200857
@@ -0,0 +1,13 @@
using Microsoft.ML.Data;

namespace SrCnnEntireDetection.DataStructures
{
    public class PhoneCallsData
    {
        [LoadColumn(0)]
        public string timestamp;

        [LoadColumn(1)]
        public double value;
    }
}
@@ -0,0 +1,11 @@
using Microsoft.ML.Data;

namespace SrCnnEntireDetection.DataStructures
{
    public class PhoneCallsPrediction
    {
        // Vector holding the anomaly detection results, in this order:
        // isAnomaly, anomalyScore, magnitude, expectedValue, boundaryUnits, upperBoundary, lowerBoundary.
        [VectorType(7)]
        public double[] Prediction { get; set; }
    }
}
@@ -0,0 +1,128 @@
using System;
using Microsoft.ML;
using System.IO;
using SrCnnEntireDetection.DataStructures;
using Microsoft.ML.TimeSeries;
using System.Linq;

namespace SrCnnEntireDetection
{
    internal static class Program
    {
        private static string BaseDatasetsRelativePath = @"../../../../Data";
        private static string DatasetRelativePath = $"{BaseDatasetsRelativePath}/phone-calls.csv";

        private static string DatasetPath = GetAbsolutePath(DatasetRelativePath);

        private static MLContext mlContext;

        static void Main()
        {
            // Create MLContext to be shared across the model creation workflow objects
            mlContext = new MLContext();

            // Load the data into an IDataView.
            // This dataset is used to detect anomalies in a time series.
            IDataView dataView = mlContext.Data.LoadFromTextFile<PhoneCallsData>(path: DatasetPath, hasHeader: true, separatorChar: ',');

            // Detect temporary changes in the pattern
            DetectAnomaly(dataView);

            Console.WriteLine("=============== End of process, hit any key to finish ===============");

            Console.ReadLine();
        }

        static void DetectAnomaly(IDataView dataView)
        {
            Console.WriteLine("===============Detect anomalies in pattern===============");

            //STEP 1: Specify the input column and output column names.
            string inputColumnName = nameof(PhoneCallsData.value);
            string outputColumnName = nameof(PhoneCallsPrediction.Prediction);

            //STEP 2: Detect period on the given series.
            int period = mlContext.AnomalyDetection.DetectSeasonality(dataView, inputColumnName);
            Console.WriteLine("Period of the series is: {0}.", period);

            //STEP 3: Setup the parameters
            var options = new SrCnnEntireAnomalyDetectorOptions()
            {
                Threshold = 0.3,
                Sensitivity = 64.0,
                DetectMode = SrCnnDetectMode.AnomalyAndMargin,
                Period = period,
            };

            //STEP 4: Invoke SrCnn algorithm to detect anomaly on the entire series.
            var outputDataView = mlContext.AnomalyDetection.DetectEntireAnomalyBySrCnn(dataView, outputColumnName, inputColumnName, options);

            //STEP 5: Get the detection results as an IEnumerable
            var predictions = mlContext.Data.CreateEnumerable<PhoneCallsPrediction>(
                outputDataView, reuseRowObject: false);

            Console.WriteLine("The anomaly detection results obtained.");
            var index = 0;

            // The header matches the eight comma-separated values printed per row
            // (the row index followed by the seven elements of the Prediction vector).
            Console.WriteLine("Index,Anomaly,AnomalyScore,Mag,ExpectedValue,BoundaryUnit,UpperBoundary,LowerBoundary");
            foreach (var p in predictions)
            {
                if (p.Prediction[0] == 1)
                {
                    Console.WriteLine("{0},{1},{2},{3},{4},{5},{6},{7} <-- alert is on, detected anomaly", index,
                        p.Prediction[0], p.Prediction[1], p.Prediction[2], p.Prediction[3], p.Prediction[4], p.Prediction[5], p.Prediction[6]);
                }
                else
                {
                    Console.WriteLine("{0},{1},{2},{3},{4},{5},{6},{7}", index,
                        p.Prediction[0], p.Prediction[1], p.Prediction[2], p.Prediction[3], p.Prediction[4], p.Prediction[5], p.Prediction[6]);
                }
                ++index;
            }

            Console.WriteLine("");

            //Index,Anomaly,AnomalyScore,Mag,ExpectedValue,BoundaryUnit,UpperBoundary,LowerBoundary
            //0,0,0,0.012431224740909462,36.841787256739266,32.92296779138513,41.14206982401966,32.541504689458876
            //1,0,0,0.06732467206114204,35.67303618137362,32.92296779138513,39.97331874865401,31.372753614093227
            //2,0,0,0.053027383620274836,34.710132999891826,33.06901172138514,39.029491313022824,30.390774686760828
            //3,0,0,0.027326808903921952,33.44765248883495,33.215055651385136,37.786086547816545,29.10921842985335
            //4,0,0,0.0074169435448767015,28.937110922276364,33.06901172138514,33.25646923540736,24.61775260914537
            //5,0,0,0.01068288760963436,5.143895892785781,32.92296779138513,9.444178460066171,0.843613325505391
            //6,0,0,0.02901575691006479,5.163325228419392,32.92296779138513,9.463607795699783,0.8630426611390014
            //7,0,0,0.015220262187074987,36.76414836240396,32.92296779138513,41.06443092968435,32.46386579512357
            //8,0,0,0.029223955855920452,35.77908590657007,32.92296779138513,40.07936847385046,31.478803339289676
            //9,0,0,0.05014588266429284,34.547259536635245,32.92296779138513,38.847542103915636,30.246976969354854
            //10,0,0,0.006478629327524482,33.55193524820608,33.06901172138514,37.871293561337076,29.23257693507508
            //11,0,0,0.0144699438892775,29.091800129624648,32.92296779138513,33.392082696905035,24.79151756234426
            //12,0,0,0.00941397738418861,5.154836630338823,32.92296779138513,9.455119197619213,0.8545540630584334
            //13,0,0,0.01012680059746895,5.234332502492464,32.92296779138513,9.534615069772855,0.934049935212073
            //14,0,0,0.0391359937506989,36.54992549471526,32.92296779138513,40.85020806199565,32.24964292743487
            //15,0,0,0.01879091709088552,35.79526470980883,32.92296779138513,40.095547277089224,31.494982142528443
            //16,0,0,0.04275209137629126,34.34099013096804,32.92296779138513,38.64127269824843,30.040707563687647
            //17,0,0,0.024479312458949517,33.61201516582131,32.92296779138513,37.9122977331017,29.31173259854092
            //18,0,0,0.010781906482188448,29.223563320561812,32.92296779138513,33.5238458878422,24.923280753281425
            //19,0,0,0.006907498717766534,5.170512168851533,32.92296779138513,9.470794736131923,0.8702296015711433
            //20,0,0,0.003183991678813579,5.2614938889462834,32.92296779138513,9.561776456226674,0.9612113216658926
            //21,0,0,0.04256581040333137,36.37103858487317,32.92296779138513,40.67132115215356,32.07075601759278
            //22,0,0,0.022860533704528126,35.813544599026855,32.92296779138513,40.113827166307246,31.513262031746464
            //23,0,0,0.019266922707912835,34.05600492733225,32.92296779138513,38.356287494612644,29.755722360051863
            //24,0,0,0.008008656062259012,33.65828319077884,32.92296779138513,37.95856575805923,29.358000623498448
            //25,0,0,0.018746201354033914,29.381125690882463,32.92296779138513,33.681408258162854,25.080843123602072
            //26,0,0,0.0141022037992637,5.261543539820418,32.92296779138513,9.561826107100808,0.9612609725400283
            //27,0,0,0.013396001938040617,5.4873712582971805,32.92296779138513,9.787653825577571,1.1870886910167897
            //28,1,0.4971326063712256,0.3521692757832201,36.504694001629254,32.92296779138513,40.804976568909645,32.20441143434886 <-- alert is on, detected anomaly
        }

        public static string GetAbsolutePath(string relativePath)
        {
            FileInfo _dataRoot = new FileInfo(typeof(Program).Assembly.Location);
            string assemblyFolderPath = _dataRoot.Directory.FullName;

            string fullPath = Path.Combine(assemblyFolderPath, relativePath);

            return fullPath;
        }
    }
}
