DESIGNING CROSS-SELLING STRATEGIES AND RECOMMENDER SYSTEM IN E-COMMERCE USING SEQUENTIAL PATTERN MINING

Dr. V. Saritha, K. Rathnakar Reddy, Md. Munaff, R. Pavani

Department of Information Technology, VNR Vignana Jyothi Institute of Engineering & Technology, Hyderabad, Telangana

ABSTRACT
With the advent of the World Wide Web and its rapid development, developers have been motivated
to create web-based applications that provide various services to users. Most organisations rely
on such applications to serve their stakeholders. As a result of the day-to-day activities of
their users, these applications generate large volumes of data, which serve as a huge electronic
repository of operational data readily available for user behaviour analysis in support of
decision and policy making. Giving recommendations has become crucial for improving user
experience, increasing sales and revenue, and building customer loyalty in the quickly expanding
e-commerce industry. Unlike a traditional recommender system, a sequential-pattern-based
recommender system takes into account the temporal order of user actions. This method is
particularly valuable in domains where the sequence of events matters, such as streaming
services, e-commerce platforms and social media (Ricci et al.) [19]. The suggested approach
extracts sequential patterns from users' historical purchase data using sequential pattern mining
techniques. These patterns are then employed to create a recommendation model that predicts and
recommends items of interest to the users (Virk) [3].

KEYWORDS
Sequential pattern mining, Data mining, High Utility Pattern Mining, Closed High Utility patterns,
Cross-selling Strategy, Utility Thresholds, Support Threshold, PEU, ASU, SWU, RSU,
Recommendations

1. INTRODUCTION
Among the grounds for competition in the e-commerce industry, recommendation at the individual
level, aimed at maximizing user satisfaction, is one of the most essential for the success of an
application. Recommendation systems play a major role in this effort by tracking users' past
behaviour and tastes in order to anticipate their next actions. Sequential Pattern Mining (SPM)
is a methodology for finding frequent sequences in large transactional datasets, and it describes
user behaviour in e-commerce well. In contrast to static models, sequential models provide
improved insights because they take into account the order in which items are bought. This paper
proposes a high-utility sequential pattern mining (HUSPM)-based recommendation system that
suggests items not just by frequency but by utility, taking into account the significance or
profit value of each itemset (Lin et al.) [4]. Our Java solution evaluates customer sequences,
eliminates useless patterns, and constructs a utility-based recommendation system to improve
cross-selling practices.

2. RELATED WORKS
A) A review on pure array structure and parallel mining strategy
A pure array structure and a parallel mining strategy were developed to address the memory
overhead of High Utility Sequential Pattern Mining (HUSPM). The AHUS algorithm reduces
memory consumption by avoiding heavy data structures, while the AHUS-P algorithm uses
shared-memory parallelism to accelerate the mining process. Experimental results indicate
that both algorithms outperform HUS-Span in runtime, memory usage and scalability, which
makes them effective for large datasets (Le et al.) [1].
B) A review on mining compact high utility sequential patterns
This work introduced CHUSP, an algorithm that mines Closed High Utility Sequential
Patterns (CHUSPs). These patterns are non-redundant and compact. The CHUSP algorithm
applies three pruning strategies to minimize the search space and uses a divide-and-conquer
method for efficient mining. Extensive testing showed robust performance in terms of
execution time and memory efficiency (Dinh et al.) [2].
C) A review on the HUS-Rec system
This work proposed the High Utility Sequential Pattern Recommendation System (HUS-Rec),
which improves on HSPRec19 by using high utility sequential patterns instead of frequent
patterns. The system enriches the user-item matrix with patterns generated from both
purchase and clickstream sequences using PrefixSpan and HUSPM. Evaluation showed that
HUS-Rec improves recommendation accuracy (Virk) [3].
D) A review on mining of HUSP with a three-tier MapReduce model
This paper introduced a three-tier MapReduce framework to address the drawbacks of
single-machine HUSPM in large-scale environments. It uses id-set and utility-linked-list
data structures to ensure correctness and deliver high scalability, which makes it suitable
for big data integration (Lin et al.) [4].
E) A review on HUSRM
The HUSRM (High Utility Sequential Rule Miner) algorithm was developed to generate high
utility sequential rules, which incorporate both utility and confidence metrics. This makes
the mined rules more suitable for predictive tasks such as product recommendation. The
reported experiments show a 25× speedup in execution and 50% memory savings (Zida et
al.) [5].
F) A review on US-Rule: Discovering Utility-driven Sequential Rules
US-Rule is a more advanced algorithm than HUSRM that features tight upper bounds (LEEU,
REEU, etc.) and several pruning strategies. It performs well on sparse, dense and
long-sequence datasets, achieving faster mining and lower memory usage than pre-existing
techniques. It provides better rule prediction and is highly scalable for real-world
recommendation systems (Huang et al.) [6].
G) A review on mining HUSP from big datasets
To handle very large datasets, this paper introduced a Spark-based HUSPM framework that
relies on in-memory computing. Compared to Hadoop, Spark drastically reduces input/output
overhead. The four-stage MapReduce design enhances performance, especially in distributed
environments, and efficiently mines high-utility sequential patterns (Lin et al.) [7].
H) A review on utility-based association rules
This work proposed a two-phase utility-based recommendation approach. First, it generates
utility-based association rules from user transaction data. Then, it applies those rules in
a recommendation engine. Because the approach incorporates both item frequency and profit,
it yields more personalized and profitable product suggestions than traditional systems
based on frequent itemsets (Stuti et al.) [8].
3. PROPOSED METHODOLOGY
The system applies High Utility Sequential Pattern Mining (HUSPM) (Le et al.) [1] to scan
sequential transaction data and discover useful sequences. It starts by reading the datasets,
which comprise user purchase sequences along with their respective utility values. It then
calculates support and utility thresholds and discards low-value items and sequences, so that
only highly informative sequences are left. From this pruned data it constructs CHUSP (Closed
High Utility Sequential Pattern) maps, which capture high-utility patterns with less redundancy.
To issue recommendations, the model matches user session prefixes against the discovered closed
patterns, so that personalized suggestions are generated from purchase history. To keep storage
and computation costs low, the system stores ASU (Actual Sequence Utility), PEU (Prefix
Extension Utility) and the IU-List (Indexed Utility List) in hash maps. This improves
scalability and performance, since the model handles large sets of transactions without
sacrificing pattern quality or recommendation generation.

3.1 Problem Statement and Definitions


E-commerce recommendation systems deal with massive sequential databases of customer activity,
such as users' historical purchase sequences.
Sequential Pattern Mining discovers frequent sequential patterns in a sequential database. An
e-commerce recommender system provides recommendations based on the sequential patterns that
match a user's interests.
Definition-1) Item, Itemset and Sequence: Let I = {i1, i2, …, in} be the set of all items. An
itemset is a non-empty, unordered collection of items from I. A sequence is an ordered list of
itemsets, in which each item is associated with an internal utility.
Definition-2) Utility of an Item in a Sequence: The utility of item i in sequence s is the
product of its internal utility iu(i, s), the quantity of i in s, and its external utility
eu(i), a static profit or weight associated with i:
u(i, s) = iu(i, s) × eu(i).
Definition-3) Utility of a Sequence: The utility of a sequence s is Se(s) = Σ u(i, s), summed
over all items i in s. It is represented in the dataset as SUtility:X.
Definition-4) Support: The support of item i is the number of sequences in which i appears at
least once.
Definition-5) ASU: The Actual Sequence Utility map (map ASU) stores the actual utility of
patterns over the given dataset, computed as ASU(pattern) = Σ u(pattern, s) over the sequences s
that contain the pattern.
Definition-6) PEU: The Prefix Extension Utility map (map PEU) stores an upper bound on the
utility of a pattern and all of its extensions. It is used to guide pruning decisions before the
full utility is computed.
Definition-7) SWU: The Sequence Weighted Utility of item i is the sum of the utilities Se(s) of
the sequences s in which i appears.
Definition-8) RSU: The Remaining Sequence Utility of a pattern is the utility of the items that
remain after the occurrence of the current pattern in a sequence s.
Definition-9) Promising Items: Items that satisfy the minimum support or SWU threshold are
called promising items; they are retained after pruning.
Definition-10) Index Utility List: mapIUList is a structure that stores the indexed occurrences
of each item together with their individual utilities. This list is used to compute the ASU and
PEU values.
Definition-11) HUSP: A High Utility Sequential Pattern is a pattern for which
ASU(pattern) >= minUtilOccupancy × Se(SDB),
where Se(SDB) is the total utility of the sequence database SDB.
Definition-12) Closed HUSP (CHUSP_Map): A pattern is a CHUSP if it satisfies the HUSP
condition and ∄ super-pattern q such that pattern ⊂ q ∧ ASU(pattern) = ASU(q).
Definition-13) Utility Threshold and Support Threshold: The utility threshold, also called the
minimum utility threshold, is the minimum utility value a pattern must have to be considered
high utility. The support threshold, also known as the minimum support threshold, is the minimum
number of sequences in which an item must appear to be considered frequent or promising.
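
The basic measures defined above can be illustrated on a toy database. The following is a minimal Java sketch, not the authors' implementation: the class name UtilityMeasures, its method names and the toy data are hypothetical, each sequence is flattened to {item, utility} pairs (itemset boundaries do not affect these measures), and each utility value is assumed to be the already multiplied u(i, s).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that evaluates the basic utility measures on a toy database. */
final class UtilityMeasures {

    /** Se(s): sum of u(i, s) over all item occurrences in one sequence. */
    static int sequenceUtility(int[][] sequence) {
        int total = 0;
        for (int[] occurrence : sequence) {
            total += occurrence[1];              // occurrence = {item, u(item, s)}
        }
        return total;
    }

    /** Support: number of sequences in which each item appears at least once. */
    static Map<Integer, Integer> support(int[][][] db) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int[][] sequence : db) {
            Set<Integer> seen = new HashSet<>();
            for (int[] occurrence : sequence) {
                if (seen.add(occurrence[0])) {   // count each item once per sequence
                    counts.merge(occurrence[0], 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** SWU: for each item, the sum of Se(s) over the sequences that contain it. */
    static Map<Integer, Integer> swu(int[][][] db) {
        Map<Integer, Integer> weighted = new HashMap<>();
        for (int[][] sequence : db) {
            int se = sequenceUtility(sequence);
            Set<Integer> seen = new HashSet<>();
            for (int[] occurrence : sequence) {
                if (seen.add(occurrence[0])) {
                    weighted.merge(occurrence[0], se, Integer::sum);
                }
            }
        }
        return weighted;
    }

    /** Absolute utility threshold: minUtilOccupancy x Se(SDB). */
    static double minUtility(int[][][] db, double minUtilOccupancy) {
        long total = 0;
        for (int[][] sequence : db) {
            total += sequenceUtility(sequence);
        }
        return minUtilOccupancy * total;
    }

    public static void main(String[] args) {
        int[][][] db = {
            {{1, 1}, {2, 4}, {3, 10}},           // Se = 15
            {{2, 3}, {3, 2}},                    // Se = 5
            {{1, 2}, {4, 6}, {1, 1}}             // Se = 9, item 1 occurs twice
        };
        System.out.println("support = " + support(db));
        System.out.println("SWU = " + swu(db));
        System.out.println("minUtility = " + minUtility(db, 0.5));
    }
}
```

In this toy database item 1 has support 2 and SWU 15 + 9 = 24, while item 4 has support 1 and SWU 9. With minUtilOccupancy = 0.5 the absolute threshold is 0.5 × 29 = 14.5, so a pattern needs an ASU of at least 14.5 to qualify as a HUSP.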
3.2 Data Collection
We compile a sequential dataset of historical transaction data, in which each line represents a
transaction sequence with its item utilities and total sequence utility. Items are formatted as
item[utility]; -1 marks the end of an itemset and -2 marks the end of a sequence.

The datasets used in this model are Kosarak 10k sequence utility, a subset of Kosarak that
consists of 10,000 sequences, and SIGN sequence utility, a sign language dataset that contains
approximately 800 sequences.

Example:

1[1] 2[4] -1 3[10] -1 6[9] -1 7[2] -1 5[1] -1 -2 SUtility:27

1[1] - item[utility]

-1 - itemset delimiter

-2 - sequence ends

SUtility - total sequence utility
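
The line format above can be read with a small tokenizer. The sketch below is one possible way to do it in Java, written under the assumption that tokens are separated by whitespace and that every item token has the form item[utility]; the class SequenceLineParser and its nested types are hypothetical names, not part of the original system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for one line of the item[utility] sequence format. */
final class SequenceLineParser {

    /** One item occurrence together with its utility in the sequence. */
    static final class ItemUtility {
        final int item;
        final int utility;
        ItemUtility(int item, int utility) {
            this.item = item;
            this.utility = utility;
        }
    }

    /** A parsed sequence: ordered itemsets plus the declared SUtility value. */
    static final class UtilitySequence {
        final List<List<ItemUtility>> itemsets;
        final int sequenceUtility;
        UtilitySequence(List<List<ItemUtility>> itemsets, int sequenceUtility) {
            this.itemsets = itemsets;
            this.sequenceUtility = sequenceUtility;
        }
    }

    static UtilitySequence parse(String line) {
        List<List<ItemUtility>> itemsets = new ArrayList<>();
        List<ItemUtility> current = new ArrayList<>();
        int sequenceUtility = 0;
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty() || token.equals("-2")) {
                continue;                                  // blank input or end of sequence
            } else if (token.equals("-1")) {               // itemset delimiter
                if (!current.isEmpty()) {
                    itemsets.add(current);
                }
                current = new ArrayList<>();
            } else if (token.startsWith("SUtility:")) {    // total sequence utility
                sequenceUtility = Integer.parseInt(token.substring("SUtility:".length()));
            } else {                                       // item[utility]
                int open = token.indexOf('[');
                int item = Integer.parseInt(token.substring(0, open));
                int utility = Integer.parseInt(token.substring(open + 1, token.length() - 1));
                current.add(new ItemUtility(item, utility));
            }
        }
        return new UtilitySequence(itemsets, sequenceUtility);
    }

    public static void main(String[] args) {
        UtilitySequence s =
            parse("1[1] 2[4] -1 3[10] -1 6[9] -1 7[2] -1 5[1] -1 -2 SUtility:27");
        System.out.println(s.itemsets.size() + " itemsets, SUtility = " + s.sequenceUtility);
    }
}
```

For the example line, the parser yields five itemsets, the first of which holds items 1 and 2, and the item utilities 1 + 4 + 10 + 9 + 2 + 1 add up to the declared SUtility of 27.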

3.3 Data Preparation


To enhance the quality and consistency of the dataset, we apply several preparation steps. First,
we read the input sequences with a BufferedReader and tokenize each line into itemsets and
utilities. We then remove unpromising items based on the computed support threshold and the
utility threshold occupancy. The dataset is pruned multiple times to improve its quality and
remove unpromising items, which yields an in-memory pruned database together with the utility
lists and the actual utility (ASU) and PEU maps.
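
The repeated pruning step can be sketched as a loop that recomputes support and SWU after every pass, because removing an item lowers the utility of the sequences that contained it. This is an illustrative Java sketch under stated assumptions, not the authors' code: the class UnpromisingItemPruner is a hypothetical name, sequences are flattened to {item, utility} pairs, and an item is kept only when it meets both thresholds (the condition can be relaxed if either threshold should suffice).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical iterative pruner that removes unpromising items from a toy database. */
final class UnpromisingItemPruner {

    /** db[s][k] = {item, utility}; returns the pruned in-memory database. */
    static List<List<int[]>> prune(int[][][] db, int minSup, double minUtil) {
        List<List<int[]>> current = new ArrayList<>();
        for (int[][] sequence : db) {
            List<int[]> copy = new ArrayList<>();
            for (int[] occurrence : sequence) {
                copy.add(occurrence);
            }
            current.add(copy);
        }
        while (true) {
            // Recompute support and SWU on the current (possibly already pruned) data.
            Map<Integer, Integer> support = new HashMap<>();
            Map<Integer, Long> swu = new HashMap<>();
            for (List<int[]> sequence : current) {
                long se = 0;
                for (int[] occurrence : sequence) {
                    se += occurrence[1];
                }
                Set<Integer> seen = new HashSet<>();
                for (int[] occurrence : sequence) {
                    if (seen.add(occurrence[0])) {
                        support.merge(occurrence[0], 1, Integer::sum);
                        swu.merge(occurrence[0], se, Long::sum);
                    }
                }
            }
            // Keep only promising occurrences; assumption: both thresholds must hold.
            boolean removed = false;
            List<List<int[]>> next = new ArrayList<>();
            for (List<int[]> sequence : current) {
                List<int[]> kept = new ArrayList<>();
                for (int[] occurrence : sequence) {
                    int item = occurrence[0];
                    if (support.get(item) >= minSup && swu.get(item) >= minUtil) {
                        kept.add(occurrence);
                    } else {
                        removed = true;
                    }
                }
                if (!kept.isEmpty()) {
                    next.add(kept);
                }
            }
            current = next;
            if (!removed) {
                return current;                  // fixed point: nothing left to prune
            }
        }
    }

    public static void main(String[] args) {
        int[][][] db = {
            {{1, 1}, {2, 4}, {3, 10}},
            {{2, 3}, {3, 2}},
            {{1, 2}, {4, 6}}
        };
        List<List<int[]>> pruned = prune(db, 2, 10.0);
        System.out.println("sequences kept: " + pruned.size());
    }
}
```

With minSup = 2 and minUtil = 10, item 4 is dropped in the first pass (support 1, SWU 8). The second pass recomputes the measures on the smaller third sequence, finds that every remaining item still meets both thresholds, and stops.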

3.4 Model Preparation


3.4.1 Map Construction and Indexing
Before mining the patterns, we build fast-access data structures:
IU-List (Indexed Utility List): stores the item utilities in each sequence.
PEU map: holds the prefix extension utility of patterns.
ASU map: stores the actual utility of new patterns.
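
One way to picture the IU-List is as a hash map from each item to its list of indexed occurrences. The Java sketch below is a simplified illustration rather than the structure used in AHUS: the names IndexedUtilityList, Entry, build and asuOfItem are hypothetical, each occurrence is given as {itemsetIndex, item, utility}, and the utility of a single-item pattern in a sequence is assumed to be the largest utility among its occurrences there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical indexed utility list: item -> occurrences with position and utility. */
final class IndexedUtilityList {

    /** One occurrence of an item: where it occurs and how much utility it carries. */
    static final class Entry {
        final int sequenceId;
        final int itemsetIndex;
        final int utility;
        Entry(int sequenceId, int itemsetIndex, int utility) {
            this.sequenceId = sequenceId;
            this.itemsetIndex = itemsetIndex;
            this.utility = utility;
        }
    }

    /** db[s][k] = {itemsetIndex, item, utility}. */
    static Map<Integer, List<Entry>> build(int[][][] db) {
        Map<Integer, List<Entry>> iuList = new HashMap<>();
        for (int sid = 0; sid < db.length; sid++) {
            for (int[] occurrence : db[sid]) {
                iuList.computeIfAbsent(occurrence[1], k -> new ArrayList<>())
                      .add(new Entry(sid, occurrence[0], occurrence[2]));
            }
        }
        return iuList;
    }

    /** ASU of a single-item pattern: best occurrence per sequence, summed over sequences. */
    static int asuOfItem(List<Entry> entries) {
        Map<Integer, Integer> bestPerSequence = new HashMap<>();
        for (Entry e : entries) {
            bestPerSequence.merge(e.sequenceId, e.utility, Math::max);
        }
        int sum = 0;
        for (int utility : bestPerSequence.values()) {
            sum += utility;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][][] db = {
            {{0, 1, 1}, {0, 2, 4}, {1, 3, 10}},
            {{0, 2, 3}, {1, 3, 2}},
            {{0, 1, 2}, {1, 4, 6}, {2, 1, 5}}
        };
        Map<Integer, List<Entry>> iuList = build(db);
        System.out.println("ASU of item 1 = " + asuOfItem(iuList.get(1)));
    }
}
```

Because every occurrence is indexed by sequence and itemset position, the utility of a pattern can be evaluated from these lists without rescanning the raw sequences.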
3.4.2 Pattern Generation
AHUS ALGORITHM
The AHUS algorithm extends the current pattern by adding new items in several ways and checks
whether the extended pattern is a HUSP. If it is, the extension of that pattern continues (Le et
al.) [1].
Method-1 (Le et al.) [1]
Input:
• SDB: a sequence database from which all unpromising items have been removed
• λ: minimum utility threshold
• P: a prefix pattern
Output:
• The set of frequent high utility patterns corresponding to prefix P

If PEU(P) < λ then
    return
Scan the P-projected database:
    If euu(P') >= λ then
        a) place the item into i-exts (I-Concat), or
        b) place the item into s-exts (S-Concat)
Remove low-RSU items from i-exts and s-exts
For each item i ∈ i-exts do
    P' ← I-Concatenation(P, i)
    Construct iulist(P')
    If ASU(P') >= λ then
        Output P'
    Recursively call AHUS(P', iulist(P'), ASU(P'))
For each item i ∈ s-exts do
    P' ← S-Concatenation(P, i)
    Construct iulist(P')
    If ASU(P') >= λ then
        Output P'
    Recursively call AHUS(P', iulist(P'), ASU(P'))

Algorithm-1 above (Le et al.) [1] generates the frequent high utility sequential patterns.
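
The two extension operations used in Method-1 differ only in where the new item is placed. The following Java sketch illustrates just that step, assuming a pattern is represented as a list of itemsets; the class PatternExtension and its methods are hypothetical names, and the utility bookkeeping of AHUS is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the two ways a pattern can be extended. */
final class PatternExtension {

    /** I-Concatenation: the item joins the last itemset of the pattern. */
    static List<List<Integer>> iConcat(List<List<Integer>> pattern, int item) {
        if (pattern.isEmpty()) {
            throw new IllegalArgumentException("I-Concatenation needs a non-empty pattern");
        }
        List<List<Integer>> extended = copyOf(pattern);
        extended.get(extended.size() - 1).add(item);
        return extended;
    }

    /** S-Concatenation: the item starts a new itemset at the end of the pattern. */
    static List<List<Integer>> sConcat(List<List<Integer>> pattern, int item) {
        List<List<Integer>> extended = copyOf(pattern);
        List<Integer> itemset = new ArrayList<>();
        itemset.add(item);
        extended.add(itemset);
        return extended;
    }

    /** Deep copy so that the prefix stays unchanged for its other extensions. */
    private static List<List<Integer>> copyOf(List<List<Integer>> pattern) {
        List<List<Integer>> copy = new ArrayList<>();
        for (List<Integer> itemset : pattern) {
            copy.add(new ArrayList<>(itemset));
        }
        return copy;
    }

    public static void main(String[] args) {
        List<List<Integer>> prefix = new ArrayList<>();
        List<Integer> first = new ArrayList<>();
        first.add(1);
        prefix.add(first);
        System.out.println(iConcat(prefix, 2));   // item 2 bought together with item 1
        System.out.println(sConcat(prefix, 3));   // item 3 bought after item 1
    }
}
```

Copying the prefix before each extension keeps P intact, which matters because the same prefix is extended once per candidate item in both i-exts and s-exts.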
CHUSP ALGORITHM
The patterns generated by the AHUS algorithm (Le et al.) [1] are high utility sequential
patterns, but they lack compactness and contain many redundancies. To improve compactness and
retain only the non-redundant, most informative patterns, we use the CHUSP algorithm, which
retains the closed patterns (Dinh et al.) [2].
Method-2 (Dinh et al.) [2]
Input:
• Prefix pattern r and extended pattern r' (r ⊆ r')
• CHUSP_Map: HashMap that contains the closed patterns
• ¬CHUSP_Map: HashMap that contains the non-closed patterns
• minSup, minUtil
Closed patterns algorithm
if ((support(r') >= minSup) && (utility(r') >= minUtil)) then
    if support(r) == support(r') then
        Remove r from CHUSP_Map
        Add r into ¬CHUSP_Map
        Add r' into CHUSP_Map
    else
        Add r' into CHUSP_Map
        If r ∈ ¬CHUSP_Map then
            Add r into ¬CHUSP_Map
Return CHUSP_Map
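
The map updates of Method-2 can be sketched in Java as follows. This is a minimal sketch that follows the listing as written, with several assumptions: patterns are identified by plain strings, each map stores the pattern utility as its value, and the class ClosedPatternMaps with its update method is a hypothetical name. The final conditional of the listing adds r to the non-closed map only when r is already there, which does not change a map, so it is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the closed and non-closed pattern maps of Method-2. */
final class ClosedPatternMaps {

    final Map<String, Integer> closed = new HashMap<>();      // CHUSP_Map: pattern -> utility
    final Map<String, Integer> nonClosed = new HashMap<>();   // non-closed patterns

    /** Processes one prefix r together with one of its extensions rExt. */
    void update(String r, int supportR, int utilityR,
                String rExt, int supportExt, int utilityExt,
                int minSup, int minUtil) {
        if (supportExt < minSup || utilityExt < minUtil) {
            return;                       // the extension is not frequent and high utility
        }
        if (supportR == supportExt) {
            // The extension occurs in as many sequences as r, so r is not closed.
            closed.remove(r);
            nonClosed.put(r, utilityR);
        }
        // In both branches of the listing the extension enters the closed map.
        closed.put(rExt, utilityExt);
    }

    public static void main(String[] args) {
        ClosedPatternMaps maps = new ClosedPatternMaps();
        maps.closed.put("1", 10);
        maps.update("1", 3, 10, "1 -1 2", 3, 14, 2, 5);
        System.out.println("closed = " + maps.closed.keySet());
        System.out.println("non-closed = " + maps.nonClosed.keySet());
    }
}
```

In the usage above, pattern "1" and its extension "1 -1 2" have the same support, so "1" moves to the non-closed map and only the longer pattern remains in the closed map.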

3.4.3 Pattern Output


The patterns generated by the CHUSP algorithm are the closed patterns, which are the most
informative, non-redundant and highly compact.

Figure-1 Final closed patterns

4. RESULT
The recommendations are based on the final closed patterns obtained. The system analyses the
prefix of the pattern given as input and generates recommendations based on the matching pattern
sequences.

Figure -2 Recommendations
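
The prefix-matching step can be sketched in Java as shown below. This is a simplified illustration, not the system's actual recommender: the class PrefixRecommender is a hypothetical name, each pattern is flattened to a list of items with one item per itemset, and ranking the candidate items by the utility of the best matching pattern is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical recommender that matches a session prefix against closed patterns. */
final class PrefixRecommender {

    /** Returns up to topK next items of patterns that start with the prefix. */
    static List<Integer> recommend(Map<List<Integer>, Integer> closedPatterns,
                                   List<Integer> prefix, int topK) {
        Map<Integer, Integer> score = new HashMap<>();
        for (Map.Entry<List<Integer>, Integer> entry : closedPatterns.entrySet()) {
            List<Integer> pattern = entry.getKey();
            boolean startsWithPrefix = pattern.size() > prefix.size()
                    && pattern.subList(0, prefix.size()).equals(prefix);
            if (startsWithPrefix) {
                int nextItem = pattern.get(prefix.size());
                score.merge(nextItem, entry.getValue(), Math::max);
            }
        }
        List<Integer> items = new ArrayList<>(score.keySet());
        // Highest utility first; ties are broken by item id to keep the order stable.
        items.sort((a, b) -> {
            int byUtility = Integer.compare(score.get(b), score.get(a));
            return byUtility != 0 ? byUtility : Integer.compare(a, b);
        });
        return new ArrayList<>(items.subList(0, Math.min(topK, items.size())));
    }

    public static void main(String[] args) {
        Map<List<Integer>, Integer> closedPatterns = new HashMap<>();
        closedPatterns.put(List.of(1, 2, 3), 30);
        closedPatterns.put(List.of(1, 2, 4), 40);
        closedPatterns.put(List.of(2, 3), 50);
        System.out.println(recommend(closedPatterns, List.of(1, 2), 2));
    }
}
```

For the session prefix 1, 2 the two matching patterns suggest items 4 and 3, with item 4 ranked first because its pattern carries the higher utility. The pattern 2, 3 is ignored since it does not begin with the prefix.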
5. CONCLUSION
In this project, we developed a High Utility Sequential Pattern based recommendation system
that improves cross-selling strategies by studying user behaviour. By using High Utility
Sequential Pattern Mining (HUSPM), the system can identify item sequences that are both frequent
and profitable, which helps in making more accurate and profitable recommendations (Dinh et al.)
[2].

This algorithm manages large-scale transaction data efficiently, eliminates redundant patterns
through pruning, and constructs a closed high-utility pattern map to offer precise
recommendations. Our experimental outcome supports the effectiveness of this approach: it
enhances recommendation accuracy and improves system performance in pruning and utility analysis
(Lin et al.) [4].

Overall, this model establishes a framework for future intelligent e-commerce systems. Future
research using more sophisticated AI techniques, such as deep learning and reinforcement
learning, can make the modelling of dynamic user behaviour more personalized and scalable
(Zhang et al.) [20].

REFERENCES
[1] Bac Le, Ut Huynh, and Duy-Tai Dinh, "A pure array structure and parallel strategy for high-
utility sequential pattern mining," Expert Systems with Applications, vol. 104, pp. 107–120, 2018.
[2] Tai Dinh, Philippe Fournier-Viger, and Huynh Van Hong, "Mining compact high utility
sequential patterns," NAIS Journal, vol. 43, pp. 1–15, 2021.
[3] Komal Virk, "Improving E-Commerce Recommendations using High Utility Sequential Patterns
of Historical Purchase and Click Stream Data," [Link]. Thesis, University of Windsor, Canada,
2021.
[4] Jerry Chun-Wei Lin, Youcef Djenouri, Gautam Srivastava, Yuanfa Li, and Philip S. Yu,
"Scalable Mining of High-Utility Sequential Patterns with Three-Tier MapReduce Model," ACM
Transactions on Knowledge Discovery from Data, vol. 16, no. 3, pp. 60:1–60:26, Nov. 2021.
[5] Souleymane Zida, Philippe Fournier-Viger, Cheng-Wei Wu, Jerry Chun-Wei Lin, and Vincent S.
Tseng, "Efficient Mining of High-Utility Sequential Rules," in Proceedings of PAKDD, 2016.
[6] Gengsen Huang, Wensheng Gan, Jian Weng, and Philip S. Yu, "US-Rule: Discovering Utility-
driven Sequential Rules," ACM Transactions on Knowledge Discovery from Data, vol. 1, no. 1,
Article 1, pp. 1–22, 2021.
[7] Jerry Chun-Wei Lin, Yuanfa Li, Philippe Fournier-Viger, Youcef Djenouri, and Leon Shyue-
Liang Wang, "Mining High-Utility Sequential Patterns from Big Datasets," in IEEE
International Conference on Big Data (Big Data), pp. 2674–2681, 2019.
[8] Stuti Stuti, Kanika Gupta, Nishant Srivastava, and Ankita Verma, "A Novel Approach of
Product Recommendation Using Utility-Based Association Rules," International Journal of
Information Retrieval Research (IJIRR), vol. 12, no. 1, 2022.
[9] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules in large databases," in
Proceedings of the 20th International Conference on Very Large Data Bases (VLDB '94), pp.
487–499, San Francisco, CA, USA, Morgan Kaufmann Publishers Inc., 1994.
[10] C. F. Ahmed, S. K. Tanbeer, and B.-S. Jeong, "A novel approach for mining high-utility
sequential patterns in sequence databases," ETRI Journal, vol. 32, no. 5, pp. 676–686, 2010.
doi:10.4218/etrij.10.1510.0066
[11] P. Fournier-Viger, J. C.-W. Lin, A. Gomariz, T. Gueniche, A. Soltani, Z. Deng, and H. T. Lam,
"The SPMF open-source data mining library version 2," in Joint European Conference on
Machine Learning and Knowledge Discovery in Databases
[12] M. Joshi and D. Bhalodia, "Mining high utility itemset using graphics processor," in
International Symposium on Intelligent Systems Technologies and Applications, pp. 665–674,
Springer, 2016
[13] B. Le, D.-T. Dinh, V.-N. Huynh, Q.-M. Nguyen, and P. Fournier-Viger, "An efficient algorithm
for hiding high utility sequential patterns," International Journal of Approximate Reasoning,
2018.
[14] M. N. Quang, T. Dinh, U. Huynh, and B. Le, "MHHUSP: An integrated algorithm for mining
and hiding high utility sequential patterns," in Knowledge and Systems Engineering (KSE), 8th
International Conference on, pp. 13–18, IEEE, 2016.
[15] W. Song, Y. Liu, and J. Li, "Bahui: Fast and memory efficient mining of high utility itemsets
based on bitmap," International Journal of Data Warehousing and Mining (IJDWM), vol. 10, no.
1, pp. 1–15, 2014.
[16] A. Veloso, W. Meira, and S. Parthasarathy, "New parallel algorithms for frequent itemset mining
in very large databases," in Proceedings of the 15th Symposium on Computer Architecture and
High Performance Computing, pp. 158–166, IEEE, 2003
[17] F. Zhang, Y. Zhang, and J. D. Bakos, "Accelerating frequent itemset mining on graphics
processing units," The Journal of Supercomputing, vol. 66, no. 1, 2013.
[18] M. J. Zaki, M. Ogihara, S. Parthasarathy, and W. Li, "Parallel data mining for association rules
on shared-memory multi-processors," in Proceedings of the 1996 ACM/IEEE Conference on
Supercomputing, IEEE, 1996
[19] Ricci, Francesco, et al. "Recommender Systems: Introduction and Challenges." Recommender
Systems Handbook, Springer, 2015, pp. 1–3.
[20] Zhang, Shuai, Lina Yao, and Aixin Sun. "Deep Learning Based Recommender System: A
Survey and New Perspectives." ACM Computing Surveys, vol. 52, no. 1, 2019, pp. 1–38.

Authors

Dr. [Link] is currently working as an assistant professor at VNRVJIET, with vast experience
and expertise in Data Mining, Text Mining and DBMS, and hands-on experience in sequential
pattern mining, high utility pattern mining and SQL.

[Link] Reddy is currently pursuing a Bachelor’s degree in Information Technology at VNR
VJIET, with a strong academic record. He has a keen interest in Data Mining and Text Mining,
with hands-on experience in sequential pattern mining, compact mining rules and high-utility
pattern mining.

[Link] is currently pursuing a Bachelor’s degree in Information Technology at VNR VJIET,
with a strong academic record. He has a keen interest in Data Mining and Text Mining, with
hands-on experience in sequential pattern mining, compact mining rules and high-utility
pattern mining.

[Link] is currently pursuing a Bachelor’s degree in Information Technology at VNR VJIET,
with a strong academic record. She has a keen interest in Data Mining and Text Mining, with
hands-on experience in sequential pattern mining, compact mining rules and high-utility
pattern mining.
