TSARM-UDP: An Efficient Time Series Association Rules Mining Algorithm Based on Up-to-Date Patterns
Figure 1. The framework of TSARM-UDP.
Figure 2. The difference of calculating Support between Formula (2) and Formula (5).
Figure 3. The flowchart of the proposed TSARM-UDP.
Figure 4. Comparisons of mining results on the stock dataset (min_conf = 0.7, T = 7).
Figure 5. Comparisons of the rule numbers and L_k on the stock dataset (min_conf = 0.8 and T = 7).
Figure 6. L_1, L_2, L_k, and rule numbers comparison on the BF dataset (min_conf = 0.6 and T = 6).
Figure 7. Rule numbers and L_k comparison on the BF dataset (min_conf = 0.7, T = 6).
Figure 8. Rule numbers and L_k comparison on the BF dataset (min_conf = 0.8, T = 6).
Abstract
1. Introduction
- This paper proposes a new TSAR mining framework that mines more rules from time-series data with higher accuracy.
- By targeting rare patterns that occur only during a limited period, the proposed algorithm can find more effective association rules.
- The proposed TSARM-UDP method can extract temporal relationships without relying on prior expert knowledge, and it extends the rules’ applicability to the whole dataset.
2. Preliminaries of Our Proposed Algorithm
2.1. Time Series
2.2. Association Rules Mining
- (1) Find all frequent itemsets in the original log database that satisfy the predefined min_sup.
- (2) Generate association rules from the frequent itemsets that satisfy the predefined min_conf.
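The two steps above can be illustrated with a minimal Python sketch of classical (non-temporal) association rules mining. It is not the TSARM-UDP procedure itself: the function names `support`, `frequent_itemsets`, and `rules` are hypothetical, the level-wise search is a plain Apriori-style join, and the ten transactions are the ones listed in the example transaction table of this article.

```python
from itertools import combinations

# Transactions 1..10 from the example transaction table in this article.
TRANSACTIONS = [
    {"b", "d", "f"}, {"b", "d", "f"}, {"d", "f"}, {"a", "d"}, {"a", "b", "d"},
    {"d"}, {"c"}, {"a", "b", "c"}, {"c", "e", "f"}, {"b", "d"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_sup):
    """Step (1): level-wise search for itemsets with support >= min_sup."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        kept = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_sup:
                kept[candidate] = s
        frequent.update(kept)
        # Join step: merge frequent k-itemsets that differ in exactly one item.
        level = list({a | b for a, b in combinations(kept, 2)
                      if len(a | b) == len(a) + 1})
    return frequent


def rules(frequent, min_conf):
    """Step (2): keep X -> Y when sup(X ∪ Y) / sup(X) >= min_conf."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = sup / frequent[lhs]  # subsets of a frequent set are frequent
                if conf >= min_conf:
                    out.append((set(lhs), set(itemset - lhs), conf))
    return out


if __name__ == "__main__":
    freq = frequent_itemsets(TRANSACTIONS, min_sup=0.3)
    for lhs, rhs, conf in rules(freq, min_conf=0.7):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

With min_sup = 0.3 and min_conf = 0.7, this sketch keeps the rules b → d and f → d on the example data. These are plain, whole-database results and should not be read as the output of the proposed algorithm.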
3. The Proposed TSARM-UDP
3.1. Time Series Association Rules Mining
3.1.1. One-Dimensional TSARs
Algorithm 1 Generating candidate itemsets
3.1.2. Multi-Dimensional TSARs
Algorithm 2 Generating candidate itemsets
3.2. Up-To-Date Patterns
3.3. The Proposed TSARM-UDP
3.3.1. Description
3.3.2. The Construction of the Algorithm
3.3.3. Set the Algorithm Parameters
3.3.4. An Example
4. Simulation Experiments
4.1. Experiment Results on the Stock Dataset
4.2. Experiment Results on the BF Dataset
4.3. Rules’ Evaluation
5. Conclusions and Further Research
Author Contributions
Funding
Conflicts of Interest
Abbreviations
List of abbreviations | |
ARs | association rules |
ARM | association rules mining |
BF | blast furnace |
TSARM | time-series association rules mining algorithm |
TSARs | time-series association rules |
UDP | up-to-date pattern |
Notation | |
D | the log database |
the number of transactions in the log database | |
i | an item or an itemset |
count(i) | the number of occurrences of item i in the database |
min_sup | the minimum Support threshold |
min_conf | the minimum Confidence threshold |
TSupport | the temporal Support |
TConfidence | the temporal Confidence |
the set used to keep the item or itemsets that cannot meet the requirement in the step of generating frequent itemsets | |
L_i | the set of frequent i-itemsets in the log database |
the set of candidate i-itemsets in the log database | |
the set used to save the item or itemsets satisfying | |
Transaction_ID | the ordinal number of the transaction in which the item is located. |
Timelist | the set of the item’s Transaction_ID |
the first Transaction_ID in the Timelist | |
T | the length of time, which is predefined |
a frequent itemset in | |
of in | |
of in | |
the number of items in of | |
the number of items in of |
References
- Jerry, C.W.L.; Wensheng, G.; Philippe, F.V.; Tzung, P.H.; Vincent, S.T. Fast algorithms for mining high-utility itemsets with various discount strategies. Adv. Eng. Inf. 2016, 30, 109–126.
- Galit, S.; Bruce, P.C.; Yahav, I.; Patel, N.R.; Lichtendahl, K.C. Data Mining for Business Analytics: Concepts, Techniques, and Applications in R; John Wiley & Sons: Hoboken, NJ, USA, 2017.
- John, A.; Church, G.M. Aligning gene expression time-series with time warping algorithms. Bioinformatics 2001, 17, 495–508.
- Mridu, S.; Nagwani, N.K. Optimal channel selection on Electroencephalography (EEG) device data using feature re-ranking and rough set theory on eye state classification problem. J. Med. Imaging Health Inform. 2018, 8, 214–222.
- Zhiang, W.; Li, C.; Cao, J.; Ge, Y. On Scalability of Association-rule-based recommendation: A unified distributed-computing framework. ACM Trans. Web. 2020, 14, 1–21.
- Chih-Wen, C.; Tsai, C.F.; Tsai, Y.H.; Wu, Y.C.; Chang, F.R. Association rule mining for the ordered placement of traditional Chinese medicine containers: An experimental study. Medicine 2020, 99, 102–126.
- Alam, S.; Ila, M.; Sickles, R.C. Time series analysis of deregulatory dynamics and technical efficiency: The case of the US airline industry. Int. Econ. Rev. 2000, 41, 203–218.
- Fu-lai, C.; Fu, T.C.; Luk, R.; Ng, V. Evolutionary time-series segmentation for stock data mining. In Proceedings of the IEEE Congress on Evolutionary Computation, ICDM 2002, Maebashi City, Japan, 9–12 December 2002; IEEE Computer Society: Washington, DC, USA, 2002; pp. 83–90.
- Matthews, S.G.; Gongora, M.A.; Hopgood, A.A.; Ahmadi, S. Web usage mining with evolutionary extraction of temporal fuzzy association rules. Knowl.-Based Syst. 2013, 54, 66–72.
- Okolica, J.S.; Peterson, G.L.; Mills, R.F.; Grimaila, M.R. Sequence pattern mining with variables. IEEE Trans. Knowl. Data Eng. 2018, 32, 177–187.
- Haupt, R.L.; Haupt, S.E. Practical Genetic Algorithms; Khosravy, M., Gupta, N., Eds.; Wiley-IEEE Publication: New York, NY, USA, 2004; pp. 154–196.
- Sacchi, L.; Larizza, C.; Combi, C.; Bellazzi, R. Data mining with temporal abstractions: Learning rules from time-series. Data Min. Knowl. Discov. 2007, 15, 217–247.
- Chen, C.-H.; Lan, G.C.; Hong, T.P.; Lin, S.B. Mining fuzzy temporal association rules by item lifespans. Appl. Soft. Comput. 2016, 41, 265–274.
- Chen, C.-H.; Hong, T.-P.; Tseng, V.S. Fuzzy data mining for time-series data. Appl. Soft. Comput. 2012, 12, 536–542.
- Shirsath, P.A.; Verma, V.K. A recent survey on incremental temporal association rule mining. IJITEE 2013, 13, 2278–3075.
- Yang, Y.; Tang, Y. The construction of hierarchical network model and wireless activation diffusion optimization model in English teaching. EURASIP J. Wirel. Commun. Netw. 2020, 20, 1–14.
- Park, H.; Jung, J.-Y. SAX-ARM: Deviant event pattern discovery from multivariate time-series using symbolic aggregate approximation and association rule mining. Expert Syst. Appl. 2020, 141, 112950–112961.
- Agrawal, R.; Imieliński, T.; Swami, A. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, 26–28 May 1993; pp. 712–720.
- Ghorbani, M.; Abessi, M. A new methodology for mining frequent itemsets on temporal data. IEEE Trans. Eng. Manag. 2017, 64, 566–573.
- Mantovani, M.; Combi, C.; Zeggiotti, M. Discovering and analyzing trend-event patterns on clinical data. In Proceedings of the 2019 IEEE International Conference on Healthcare Informatics, Beijing, China, 10–13 June 2019; pp. 212–226.
- Combi, C.; Sabaini, A. Extraction, analysis, and visualization of temporal association rules from interval-based clinical data. In Proceedings of the 2013 Conference on Artificial Intelligence in Medicine in Europe, Murcia, Spain, 7–9 June 2013; pp. 238–247.
- Qin, L.X.; Shi, Z.Z. Research on multiple time-series inter-transaction association analysis. Comput. Eng. Appl. 2005, 41, 10–12.
- Ruan, G.; Zhang, H.; Plale, B. Parallel and quantitative sequential pattern mining for large-scale interval-based temporal data. In Proceedings of the 2014 IEEE International Conference on Big Data, Sydney, Australia, 27–30 October 2014; pp. 703–710.
- Beedkar, K.; Gemulla, R.; Martens, W. A unified framework for frequent sequence mining with subsequence constraints. ACM Trans. Database Syst. 2019, 44, 1–42.
- Combi, C.; Pozzi, G.; Rossato, R. Querying temporal clinical databases on granular trends. J. Biomed. Inform. 2012, 45, 273–291.
- Hong, T.-P.; Wu, Y.-Y.; Wang, S.-L. An effective mining approach for up-to-date patterns. Expert Syst. Appl. 2009, 36, 9747–9752.
- Wang, L.; Meng, J.; Xu, P.; Peng, K. Mining temporal association rules with frequent itemsets tree. Appl. Soft. Comput. 2018, 62, 817–829.
- Wang, L.; Li, L.L.; Meng, J.Y. Temporal association rules mining algorithm based on frequent item sets tree. Control Decis. 2018, 33, 591–599.
- Lin, J.C.-W.; Gan, W.; Hong, T.P.; Tseng, V.S. Efficient algorithms for mining up-to-date high-utility patterns. Adv. Eng. Inform. 2015, 29, 648–661.
- Lin, C.-W.; Hong, T.-P.; Lu, W.-H. Mining up-to-date knowledge based on tree structures. In Proceedings of the 2009 International Conference of Soft Computing and Pattern Recognition, Dalian, China, 14–16 October 2009; pp. 123–127.
- Lin, C.-W.; Hong, T.-P. Temporal data mining with up-to-date pattern trees. Expert Syst. Appl. 2011, 38, 15143–15150.
- Namaki, M.H.; Wu, Y.; Song, Q.; Lin, P.; Ge, T. Discovering graph temporal association rules. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 3–7 November 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 1697–1706.
- Borah, A.; Nath, B. Rare association rule mining from incremental databases. Pattern Anal. Appl. 2020, 23, 113–134.
- ISTANBUL STOCK EXCHANGE Data Set. Available online: https://archive.ics.uci.edu/ml/d (accessed on 22 July 2020).
- Nguyen, D.; Luo, W.; Phung, D.; Venkatesh, S. LTARM: A novel temporal association rule mining method to understand toxicities in a routine cancer treatment. Knowl.-Based Syst. 2018, 161, 313–328.
- Khan, S.; Parkinson, S. Eliciting and utilising knowledge for security event log analysis: An association rule mining and automated planning approach. Expert Syst. Appl. 2018, 113, 116–127.
Transaction ID | Transaction Time | Items |
1 | 2018/9/1 10:00 | b, d, f |
2 | 2018/9/1 10:05 | b, d, f |
3 | 2018/9/1 10:10 | d, f |
4 | 2018/9/1 10:15 | a, d |
5 | 2018/9/1 10:20 | a, b, d |
6 | 2018/9/1 10:25 | d |
7 | 2018/9/1 10:30 | c |
8 | 2018/9/1 10:35 | a, b, c |
9 | 2018/9/1 10:40 | c, f, e |
10 | 2018/9/1 10:45 | b, d |
Item | Timelist | Count |
a | 4, 5, 8 | 3 |
b | 1, 2, 5, 8, 10 | 5 |
c | 7, 8, 9 | 3 |
d | 1, 2, 3, 4, 5, 6, 10 | 7 |
e | 9 | 1 |
f | 1, 2, 3, 9 | 4 |
Item | Timelist | TSupport |
a | 4, 5, 8 | 0.3 |
b | 1, 2, 5, 8, 10 | 0.5 |
c | 7, 8, 9 | 0.3 |
d | 1, 2, 3, 4, 5, 6, 10 | 0.7 |
e | 9 | 0.1 |
f | 1, 2, 3, 9 | 0.4 |
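The Timelist, Count, and TSupport columns above can be reproduced from the example transaction table with a short Python sketch. The helper names `build_timelists` and `summarize` are hypothetical. The ratio computed here is simply Count divided by the ten transactions, which coincides with the TSupport column in this example; the paper's temporal Support formula is not reproduced here.

```python
from collections import defaultdict

# (Transaction_ID, items) pairs copied from the example transaction table.
LOG = [
    (1, "bdf"), (2, "bdf"), (3, "df"), (4, "ad"), (5, "abd"),
    (6, "d"), (7, "c"), (8, "abc"), (9, "cfe"), (10, "bd"),
]


def build_timelists(log):
    """Map each item to the ordered list of Transaction_IDs that contain it."""
    timelists = defaultdict(list)
    for tid, items in sorted(log):
        for item in items:
            timelists[item].append(tid)
    return dict(timelists)


def summarize(log):
    """Return {item: (timelist, count, count / number_of_transactions)}."""
    n = len(log)
    return {item: (tl, len(tl), len(tl) / n)
            for item, tl in sorted(build_timelists(log).items())}


if __name__ == "__main__":
    for item, (timelist, count, ratio) in summarize(LOG).items():
        print(item, timelist, count, ratio)
```

Keeping the Transaction_IDs per item, rather than only the counts, means the support of an itemset can later be obtained by intersecting Timelists instead of rescanning the log.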
Itemsets | Count | Timelist | Itemsets | Count | Timelist |
() | 1 | {5} | () | 1 | {6} |
() | 2 | {1, 2} | () | 1 | {6} |
() | 0 | Null | () | 0 | Null |
() | 0 | Null | () | 0 | Null |
() | 0 | Null | () | 0 | Null |
() | 0 | Null | () | 0 | Null |
() | 0 | Null | () | 1 | {2} |
() | 0 | Null | () | 0 | Null |
() | 2 | {2, 5} | () | 3 | {1, 2, 3} |
() | 3 | {4, 5, 6} | () | 0 | Null |
Itemsets | Confidence | Lift |
Rules | Accuracy Rate (%) |
Rule 1 | 75 |
Rule 2 | 100 |
Rule 3 | 100 |
Rule 4 | 100 |
min_sup | TSARM-UDP | Nguyen et al. [35] | Khan et al. [36] | FP Tree [22] |
0.3 | 146.6700 s | 4.1460 s | 3.7000 s | 2.9920 s |
0.4 | 58.5040 s | 3.4150 s | 3.4640 s | 2.8190 s |
0.5 | 41.7840 s | 3.2090 s | 3.2700 s | 2.7840 s |
0.6 | 27.6320 s | 3.1430 s | 3.1900 s | 2.7400 s |
0.7 | 11.3200 s | 2.8100 s | 2.9940 s | 2.6320 s |
Input | Descent | Normal Fluctuation | Ascent |
Blast wind volume | <3400 | 3400∼3500 | ≥3500 |
Blast wind temperature | <1170 | 1170∼1190 | ≥1190 |
Blast wind pressure | <335 | 335∼350 | ≥350 |
Oxygen enrichment | <4400 | 4400∼5000 | ≥5000 |
Top temperature | <100 | 100∼140 | ≥140 |
Normal blast velocity | <190 | 190∼200 | ≥200 |
Actual blast velocity | <220 | 220∼230 | ≥230 |
Permeability index (PI) | <23 | 23∼26 | ≥26 |
Blast furnace bosh gas volume | <4400 | 4400∼4500 | ≥4500 |
Theoretical combustion temperature | <2200 | 2200∼2300 | ≥2300 |
Permeability coefficient | <6 | 6∼7 | ≥7 |
Interval Division | Descent | Normal Fluctuation | Ascent |
Coding | 1 | 2 | 3 |
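As an illustration of how the interval division and coding tables work together, the following Python sketch maps a raw measurement to the code 1, 2, or 3. The function name `encode` and the dictionary `BANDS` are hypothetical, and only a few of the eleven inputs are included. The boundaries are copied from the interval table, with the lower bound treated as inclusive and the upper bound as exclusive, matching the "<" and "≥" columns.

```python
# Lower and upper bounds of the normal-fluctuation band, copied from the
# interval table; only a subset of the inputs is listed here.
BANDS = {
    "Blast wind volume": (3400, 3500),
    "Blast wind temperature": (1170, 1190),
    "Top temperature": (100, 140),
    "Permeability index (PI)": (23, 26),
}


def encode(variable, value, bands=BANDS):
    """Return 1 (descent), 2 (normal fluctuation), or 3 (ascent)."""
    low, high = bands[variable]
    if value < low:
        return 1
    if value >= high:
        return 3
    return 2


if __name__ == "__main__":
    for reading in (3350, 3400, 3499, 3500):
        print(reading, encode("Blast wind volume", reading))
```

Discretizing each continuous input this way turns every time step into a small set of symbolic items, which is the form that itemset-based mining expects.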
Input | Encoding Number |
Blast wind volume | 1 |
Blast wind temperature | 2 |
Blast wind pressure | 3 |
Oxygen enrichment | 4 |
Top temperature | 5 |
Normal blast velocity | 6 |
Actual blast velocity | 7 |
Permeability index (PI) | 8 |
Blast furnace bosh gas volume | 9 |
Theoretical combustion temperature | 10 |
Permeability coefficient | 11 |
T | Rule Numbers | T | Rule Numbers |
() | 25 | () | 25 |
() | 27 | () | 29 |
() | 31 | () | 28 |
() | 29 | () | 27 |
() | 27 | () | 25 |
min_sup | TSARM-UDP | Nguyen et al. [35] | Khan et al. [36] | FP Tree [22] |
1.1809 × s | 7.0400 s | 6.8650 s | 5.7770 s | |
438.1290 s | 5.6700 s | 5.8360 s | 4.5320 s | |
162.0220 s | 4.9260 s | 5.1850 s | 4.0640 s | |
45.2970 s | 4.1600 s | 4.6900 s | 3.7080 s | |
29.0880 s | 3.4410 s | 3.4440 s | 3.6820 s |
Rules | Confidence | Lift | T |
0.97952 | 1.1013 | ||
1 | 1.1395 | ||
0.98868 | 1.1266 | ||
0.90352 | 1.0766 | ||
0.98305 | 1.1053 | ||
1 | 1.1395 | ||
0.97952 | 1.1612 | ||
1 | 1.1395 | ||
0.95819 | 1.1359 | ||
0.98675 | 1.1094 | ||
0.98817 | 1.1715 | ||
0.97895 | 1.1605 |
Rules | TConfidence | Lift | T |
0.95491 | 1.0881 | ||
0.95049 | 1.0686 | ||
0.98489 | 1.0599 | ||
0.93668 | 1.0673 | ||
0.94573 | 1.0633 | ||
0.91852 | 1.0466 | ||
0.93654 | 1.053 | ||
0.78473 | 1.0429 | ||
0.95038 | 1.0829 | ||
0.91674 | 1.0307 | ||
0.80067 | 1.0641 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhao, Q.; Li, Q.; Yu, D.; Han, Y. TSARM-UDP: An Efficient Time Series Association Rules Mining Algorithm Based on Up-to-Date Patterns. Entropy 2021, 23, 365. https://doi.org/10.3390/e23030365