KHDL
INSTITUTE OF INNOVATION
ID of Class : 23D1INF50905929
Class : TI001
I. THEORETICAL BASIS
    1 What is data science, IBM, https://www.ibm.com/topics/data-science
        Machine learning (ML) classification problems are those that
require a given data set to be sorted into two or more categories.
For example, whether a person is suffering from disease X (answered
Yes or No) is a classification problem. Another
common example is whether to buy an item from an online portal
now or wait a couple of months to get the maximum discount.
Or, if you are planning to buy a car, which of the available
options is the best buy given your budget.2
    •    Clustering problems
        Clustering, an unsupervised learning technique, involves
grouping data objects into clusters so that objects within the same
cluster are similar according to a specific criterion.
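
To make the idea of clustering concrete, the following is a minimal sketch of a k-means loop written from scratch in Python. The customer data, the two features (trips per year and average spend), and the function name `k_means` are illustrative assumptions and do not come from any Vietravel system.

```python
import math

def k_means(points, centroids, iterations=10):
    """Tiny k-means: assign each point to its nearest centroid, then re-average."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        centroids = new_centroids
    return assignments, centroids

# Hypothetical customers: (trips per year, average spend in millions of VND).
customers = [(1, 5), (2, 6), (1, 4), (8, 40), (9, 42), (10, 38)]
labels, centers = k_means(customers, centroids=[customers[0], customers[3]])
print(labels)
```

No labels are supplied in advance: the two groups emerge only from the distances between points, which is what distinguishes clustering from classification.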
brought about extremely positive and effective changes to the
tourism industry.
      Vietravel is currently one of Vietnam's leading travel and
transportation companies. In its 26 years of development, Vietravel
has continually grown and improved. In its business development
strategy until 2030, Vietravel Holdings focuses on building a diverse
ecosystem with three major areas: travel; transportation - aviation;
and trade - services. Gradually, it aims to become a multi-industry
investment business group with the capacity and scope to enter
regional markets. In any field, the group demonstrates its pioneering
role, guiding consumer trends.
      To achieve these goals, Vietravel needs to apply data science
to further enhance quality in areas such as:
  •    Customer care
  •    Marketing
  •    Customer service
  •    Investment
      Vietravel has therefore applied data science in these
areas. Building on this, the team also proposes new ideas, drawn
from its own discussion and from referenced materials.
      Specifically, the current Vietravel website clearly presents its
utilities for searching travel information, including tour packages,
hotel bookings, combo packages, flight tickets, and booking. What
these utilities have in common is that they provide fields for
customers to fill in, after which the system processes the data with
its installed algorithms and returns suggestions that match the
customers' requirements and wishes.
  •    Tour packages:
      Necessary input information: departure point, destination,
departure date, and number of days.
Results: various and attractive travel programs with detailed and
specific information.
  •    Hotels:
      Input information: destination or hotel name, number of people,
number of rooms, and check-in/check-out dates.
Results: choices of hotels, room types, along with accompanying
information such as price, services, etc.
    Combo packages combining multiple services such as flight
tickets, transportation, etc., also operate similarly.
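
The input-to-results flow described for tour packages can be sketched as a simple filter over a catalogue. This is only an illustration: the `Tour` class, the `search_tours` function, and the three sample tours are hypothetical, and the sketch assumes the departure date entered by the customer is treated as the earliest acceptable date.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Tour:
    name: str
    departure_point: str
    destination: str
    departure_date: date
    days: int

def search_tours(catalogue, departure_point, destination, earliest_date, days):
    """Return tours matching the four input fields of the tour search form."""
    return [
        t for t in catalogue
        if t.departure_point == departure_point
        and t.destination == destination
        and t.departure_date >= earliest_date
        and t.days == days
    ]

# Hypothetical catalogue entries, for illustration only.
catalogue = [
    Tour("Da Nang Beach Break", "Ho Chi Minh City", "Da Nang", date(2023, 7, 1), 3),
    Tour("Da Nang and Hoi An", "Ho Chi Minh City", "Da Nang", date(2023, 7, 5), 4),
    Tour("Ha Long Cruise", "Ha Noi", "Ha Long", date(2023, 7, 2), 3),
]
matches = search_tours(catalogue, "Ho Chi Minh City", "Da Nang", date(2023, 6, 25), 3)
print([t.name for t in matches])
```

A production search would likely rank or relax matches rather than require exact equality on every field; exact matching is used here only to keep the sketch short.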
approach to predicting travel demand is based on temporary
information extracted from social media data, rather than using data
from surveys and questionnaires as in traditional methods.
      By applying data science to process this data and information,
Vietravel can capture changing trends in customers' travel
preferences. Specifically, it can identify which destinations have
potential and which types of tourism are popular. With this
supporting information, Vietravel can make investment decisions
and develop subsequent travel products.
IV. CLASSIFICATION OF TOURISM PRODUCTS –
CONVENIENCE AND PROFESSIONALISM
V. CUSTOMER APPRECIATION – ANOTHER APPLICATION
OF DATA CLASSIFICATION
Telesales team can easily contact and listen to feedback about the
company is a top priority.
    When customers participate in appreciation programs, they
provide basic information such as phone number, address, email,
age, family status, etc. Vietravel collects this information through
appreciation program forms, calls from the customer service
department, or directly from customers themselves.
Vietravel, this is a good opportunity to not only promote the
company's image but also collect customer information, thereby
capturing not only potential customers but also relatives and
acquaintances who want to use the travel services that the company
has provided, is providing, and will provide in the future.
    With the VietravelPlus account, the company can easily classify
the membership levels of cardholders. With the mechanism of
participating in promotional programs, adding Gold points and
Reward points after using the service, the company will know
whether customers are effectively using the services/products
provided by the company and their preference for those products.
    Vietravel classifies cardholders into three card levels: Silver,
Gold, and Platinum. Different card levels correspond to
different customer care approaches by Vietravel. Through data
clustering, customers with the same card color receive the same
benefits. Data classification, in turn, shows how much customers at
each level use Vietravel's products; for example, customers with
Silver cards seem to have little interest in the company's
promotional programs or premium services.
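
As a rough illustration of how accumulated points could map to the three card levels, consider the sketch below. The point thresholds, customer IDs, and function names are hypothetical; the report does not state how Vietravel actually assigns levels.

```python
from collections import defaultdict

# Hypothetical point thresholds; the real VietravelPlus rules are not given here.
TIER_THRESHOLDS = [("Platinum", 5000), ("Gold", 2000), ("Silver", 0)]

def card_tier(points):
    """Classification: map a point balance to a card level."""
    for tier, minimum in TIER_THRESHOLDS:
        if points >= minimum:
            return tier
    raise ValueError("points must be non-negative")

def group_by_tier(balances):
    """Grouping: collect customers who share a card level."""
    groups = defaultdict(list)
    for customer, points in balances.items():
        groups[card_tier(points)].append(customer)
    return dict(groups)

# Made-up balances for illustration.
balances = {"C001": 150, "C002": 2400, "C003": 7200, "C004": 900}
print(group_by_tier(balances))
```

Once customers are grouped this way, each group can be given the same care policy, which mirrors the idea that cardholders of the same level receive the same benefits.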
6.1. Context:
Reputable Passenger Transportation Companies in 2022.
Accordingly, Vietravel continues to lead the Top 10 Reputable Travel
Companies in the Vietnamese tourism industry.
      According to Vietnam Report, after almost two years of
complete shutdown due to the Covid-19 pandemic, the
Vietnamese tourism industry has shown signs of recovery, especially
since the Government reopened tourism on March 15, 2022.
      Given its internal strengths and the requirements of the new
context, Vietravel needs another important factor to drive its
development plans: investment from potential customers. Predicting
potential investors is therefore a necessary task.
Figure: Gender of respondents (Female: 15; Male: 25)
Figure: bar chart by gender (Male, Female), legend: No / Yes
   Investment
   priority order   Mutual_Funds   Equity_Market   Debentures   Government_Bonds   Fixed_Deposits   PPF   Gold
   1                4              2               1            1                  8                24    0
   2                21             5               1            4                  2                6     1
   3                9              12              4            1                  10               3     1
   4                3              16              3            7                  7                2     2
   5                2              3               3            18                 8                2     4
   6                0              2               8            7                  2                3     18
   7                1              0               20           2                  3                0     14
   Stock_market
   participation          No                 Yes            Total
   Female                 4                  11             15
   Male                   1                  24             25
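
The participation table can be turned into rates with a few lines of arithmetic. The sketch below only restates the counts from the table above; the variable and function names are illustrative.

```python
# Counts copied from the stock market participation table above.
participation = {
    "Female": {"No": 4, "Yes": 11},
    "Male": {"No": 1, "Yes": 24},
}

def yes_rate(counts):
    """Share of respondents in a group who answered Yes."""
    return counts["Yes"] / (counts["No"] + counts["Yes"])

rates = {gender: round(yes_rate(c), 3) for gender, c in participation.items()}
total_yes = sum(c["Yes"] for c in participation.values())
total = sum(c["No"] + c["Yes"] for c in participation.values())
overall = round(total_yes / total, 3)
print(rates, overall)
```

That is, 11 of 15 female respondents (about 73%) and 24 of 25 male respondents (96%) participate in the stock market, or 35 of 40 (87.5%) overall.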
   Factor            People
   Locking Period    1
   Returns           25
   Risk              14
       Objective                                       People
       Growth                                          11
       Income                                          3
       Capital Appreciation                            26
       Purpose                                     People
       Returns                                     2
       Savings for Future                          6
       Wealth Creation                             32
*Average maturity time, investment tracking frequency, and
expected profit margin:
                 Duration                              People
                 1-3 years                               18
                 3-5 years                               19
              Less than 1 year                              2
              Invest_Monitor                           People
                   Daily                                 4
                  Weekly                                 7
                  Monthly                                29
Expect                                       People
10%-20%                                      3
20%-30%                                      32
30%-40%                                      5
Avenue                                       People
Equity                                       10
Fixed Deposits                               9
Mutual Fund                                  18
PPF                                          3
Health Care                                                 13
Retirement Plan                                             24
Reason_Equity                                               People
Capital Appreciation                                        30
Dividend                                                    8
Liquidity                                                   2
Reason_Mutual                                         People
Better Returns                                        24
Fund Diversification                                  13
Tax Benefits                                          3
Reason_Bonds                                       People
Assured Returns                                    26
Safe Investment                                    13
Tax Incentives                                     1
Reason_FD                                         People
Fixed Returns                                     18
High Interest Rates                               3
Risk Free                                         19
*Information sources:
Source                                                  People
Financial Consultants                                   16
Internet                                                4
Newspapers and Magazines                                14
Television                                              6
  •   Step 2.1: Use data from the "data" file and data sampler to
      split data into two files: data and prediction.
  •   Step 2.2: Use 3 methods: SVM, Logistic regression, Decision
      tree to classify investment decisions and evaluate the
      effectiveness of each method.
  •   Step 2.3: Select the best-rated method and use it to predict
      data in the "file(2)" file.
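
The split in Step 2.1 can be sketched in plain Python as follows. This is a minimal illustration, not the tool used in the project: the 80/20 proportion, the fixed seed, and the function name `split_rows` are assumptions, and integers stand in for the 40 survey responses.

```python
import random

def split_rows(rows, sample_fraction=0.8, seed=42):
    """Shuffle rows reproducibly, then split into a modelling set and a prediction set."""
    shuffled = rows[:]                      # copy so the input list is not modified
    random.Random(seed).shuffle(shuffled)   # seeded shuffle gives a repeatable split
    cut = int(len(shuffled) * sample_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(40))  # stand-ins for the 40 survey responses
modelling, prediction = split_rows(rows)
print(len(modelling), len(prediction))
```

Keeping the prediction rows out of training matters because Step 2.3 applies the chosen model to data it has not seen before.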
       o Model evaluation
•   *Test & Score method evaluation results:
•   *Confusion matrix:
•   Logistic Regression method
•   Support Vector method
 → Model evaluation:
 Based on the results of the confusion matrix, the Decision Tree
 method produces the fewest Type 1 and Type 2 errors.
 → Conclusion: Choose the Decision Tree as the method used to
 classify data.
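
For readers unfamiliar with the terms, the sketch below shows how Type 1 errors (false positives) and Type 2 errors (false negatives) are read off a binary confusion matrix. The labels are made up for illustration and are not the project's results; the function name `confusion_counts` is likewise an assumption.

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Made-up labels for illustration only; these are not the report's results.
actual    = ["Yes", "Yes", "No", "Yes", "No",  "Yes", "No", "Yes"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "Yes", "No", "Yes"]

counts = confusion_counts(actual, predicted)
type_1_errors = counts["FP"]  # predicted Yes, actually No
type_2_errors = counts["FN"]  # predicted No, actually Yes
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, type_1_errors, type_2_errors, accuracy)
```

Comparing these two error counts across the three methods is one way to arrive at the kind of conclusion stated above.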
        o Using the model to predict potential investors
       maturity of 3-5 years and an expected profit margin of 20-
       30%. Investment decisions are mostly influenced by financial
       advisors.
  •    The main goal is to create wealth, prioritizing investments with
       an average maturity of 1-3 years and a profit margin of 20-
       30%. Focus should be placed on approaching potential
       investors through the press, as this is their main source of
       information.
  •    The primary goal and purpose of investment is to create
       wealth, prioritizing investments with an average maturity of 1-
       3 years and a profit margin of 20-30%. Advertising on
       television or launching media campaigns is necessary to reach
       this target group of investors.
       By leveraging data, Vietravel can predict the characteristics of
       potential investors to develop the most effective approach
       strategies. This is just one of many factors and only represents
       a small part of the power of data science and its applications. If
       Vietravel can expand its data sources and invest more in data
       mining, it will undoubtedly create a significant competitive
       advantage.
a suitable payment method and follow Vietravel's instructions for
payment.
    Customers can choose to pay with domestic ATM cards from
Vietnamese banks through the Onepay payment gateway or
international payment cards such as VISA, Mastercard, American
Express, JCB, or by scanning QR codes through banking/e-wallet
apps.
Method 1: Payment with domestic ATM cards
Method 2: Payment with international credit/debit cards: VISA,
Mastercard, American Express, JCB
Choose other apps:
    Alongside the benefits of diversified tour booking methods, a
challenge arises: data synchronization. Imagine that, during peak
tourism season, only one tour to a particular destination remains
and three customers book it simultaneously through different
channels: the website, the fanpage, and the Vietravel app. Without
synchronization, these bookings are likely to overlap, leaving
Vietravel with difficult issues to resolve: notifying the
unsuccessful customers as soon as possible, lost profit, and, more
importantly, damage to the company's reputation. Data science
applications can address this issue. In practice, although Vietravel
has served a large number of customers, negative feedback about
duplicated or canceled tours is rare. This points to the effective
use of data science in ensuring database consistency.
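
One common way to prevent such double bookings is to make the availability check and the seat decrement a single atomic step. The sketch below illustrates this with an in-memory lock in Python; it is a simplified stand-in for a database transaction, and the class `TourInventory` and the channel names are assumptions, not a description of Vietravel's actual system.

```python
import threading

class TourInventory:
    """Shared seat counter; every channel must reserve through the same lock."""

    def __init__(self, seats):
        self._seats = seats
        self._lock = threading.Lock()

    def reserve(self):
        # The check and the decrement happen together under the lock,
        # so two channels can never both take the last seat.
        with self._lock:
            if self._seats > 0:
                self._seats -= 1
                return True
            return False

inventory = TourInventory(seats=1)
results = {}

def book(channel):
    results[channel] = inventory.reserve()

# Three channels try to book the last remaining place at the same time.
threads = [threading.Thread(target=book, args=(c,)) for c in ("website", "fanpage", "app")]
for t in threads:
    t.start()
for t in threads:
    t.join()

confirmed = [c for c, ok in results.items() if ok]
print(confirmed)
```

Exactly one channel is confirmed, although which one depends on thread scheduling; the other two receive an immediate refusal instead of a conflicting booking.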
X. FACE RECOGNITION
decide which group corresponds to which customer group. This final
part requires human intervention, but the workload has been
significantly reduced.
                            SUMMARY
    Data science has brought tremendous opportunities for many
industries, including tourism in general and Vietravel in particular.
Modern advances in computing and rapid development of algorithms
have led to the emergence of advanced analyses that go far beyond
traditional business intelligence, allowing for deeper understanding
and better predictions. Data science algorithms are essential for
meeting the needs of a growing number of consumers and for
processing vast amounts of data. Big data has become an important
tool for airlines, hotels, booking websites, and many other websites
trying to improve their services daily. This means that the tourism
industry in general, and Vietravel in particular, can now use big
data cost-effectively.
    Big data and analytics can ultimately equip travel companies
with everything they need to understand their target customers and
achieve higher profits, or in other words, gain a competitive
advantage. At the same time, businesses also generate internal
data. This can help them achieve operational excellence, reduce
operating costs, and make smart, customized decisions that
facilitate their employees' work.
    Thanks to data analytics tools, travel companies and tour
operators can now better understand their market. They can
evaluate market activities to apply new strategies or fine-tune
existing ones for better results. Travel companies can better
understand their customers to provide suitable services and make
them feel most satisfied. They can identify customer behavior
patterns and adjust their offers to reflect current needs, attracting
more consumers and increasing sales.
    Data analysis also allows companies to evaluate their supply
chain. It enables them to source their products more intelligently
and with more information, thus potentially increasing profit margins
while maintaining competitiveness in their market. Finally, travel
companies can apply new strategies to create new revenue streams,
maximize profits, and establish a better position in the target
market.
                      ACKNOWLEDGMENTS
     First and foremost, we would like to express our deepest
gratitude to Dr. Thai Kim Phung. Throughout our study and
exploration of data science, we have received his care, assistance,
and guidance, which have been incredibly patient and dedicated. His
willingness to spend time explaining the subject matter through
various approaches has helped us better understand the dry
explanations in the textbooks. Furthermore, he has helped us
accumulate more knowledge for a deeper and more complete
perspective on life. From the knowledge he has imparted, we have
been able to complete the project: "Applications of Data Science in
Vietravel Travel Company."
        Knowledge is infinite, and everyone's ability to absorb
knowledge has certain limitations. Therefore, while completing this
project, we are sure to have some shortcomings. We sincerely hope
to receive feedback from our teacher to improve our project.
        We wish our teacher good health, happiness, and success in
his teaching career. Once again, we would like to express our
sincere gratitude.
REFERENCE MATERIAL
Data sources:
https://www.kaggle.com/
https://www.kaggle.com/datasets/nitindatta/finance-data
https://www.kaggle.com/code/nitindatta/finance-data-analysis