Introduction to Multimedia Big Data Computing for IoT

Chapter in Intelligent Systems Reference Library · January 2020


DOI: 10.1007/978-981-13-8759-3_1

Introduction to Multimedia Big Data
Computing for IoT

Sharmila, Dhananjay Kumar, Pramod Kumar and Alaknanda Ashok

Abstract With the advance of new technology, the Internet of Things (IoT) plays
an active and central role in smart homes, wearable gadgets, agricultural machinery,
retail analytics, energy resource management, and healthcare. The boom of the
internet and mobility supports the proliferation of these smart things and the massive
daily production of multimedia big data in different formats (such as images, videos,
and audio). Multimedia applications and services provide more opportunities to
compute multimedia big data. Most of the data is generated from IoT devices such as
sensors, actuators, and home appliances, and from social media. In the near
future, IoT will have a significant impact on broader domains such as healthcare, smart
energy grids, and smart cities under the name of IoT big data applications. Much research
has been carried out on different aspects of multimedia big data, such as
data acquisition, storage, mining, security, and retrieval. However, few
works offer a comprehensive survey of multimedia big data computing
for IoT. This chapter addresses the gap between multimedia big data challenges
in IoT and multimedia big data solutions by presenting the current multimedia big
data framework, the advantages and limitations of existing techniques, and
the potential applications in IoT. It also presents a comprehensive overview of
multimedia big data computing for IoT applications, fundamental challenges, and
research openings for the multimedia big data era.

Sharmila (B) · D. Kumar · P. Kumar


Department of Computer Science Engineering, Krishna Engineering College, Ghaziabad 201007,
Uttar Pradesh, India
e-mail: r.sharmila@krishnacollege.ac.in
D. Kumar
e-mail: dhananjay.kumar@krishnacollege.ac.in
P. Kumar
e-mail: pramodkumar.hod@krishnacollege.ac.in
A. Ashok
Women Institute of Technology, Dehradun, Uttarakhand Technical University, Dehradun,
Uttarakhand, India
e-mail: alakn@rediff.com

© Springer Nature Singapore Pte Ltd. 2020
S. Tanwar et al. (eds.), Multimedia Big Data Computing
for IoT Applications, Intelligent Systems Reference Library 163,
https://doi.org/10.1007/978-981-13-8759-3_1

Keywords Big data · Internet of things · Multimedia data · Unstructured data ·
Data computing

1 Introduction

With the steady rise of new technologies and their accessibility, coupled with the
availability of multimedia sources, the rapid and extensive use of multimedia data such as
videos, audio, images, and text is increasing day by day. Current sources
of multimedia big data include YouTube, Facebook, Flickr, iCloud, Instagram, and Twitter.
For example, people upload 100 h of video to YouTube every minute,
users send approximately 500 million messages per day on Twitter, and nearly 20
billion photos are on Instagram [1]. Statistical analysis shows that
multimedia data sharing over the internet reached nearly 6,130 PB per month
in 2016. By 2020, the volume of digital data is expected to surpass 40 ZB [2]. By this analysis,
each person in the world generates nearly 5,200 GB of data.
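As a quick consistency check on the last two figures, dividing the projected 40 ZB by
5,200 GB per person gives the population that the estimate implies. The sketch below is
only illustrative arithmetic and assumes decimal (SI) units, 1 ZB = 10^21 bytes and
1 GB = 10^9 bytes; the cited sources may use a different convention.

```python
# Assumption: decimal (SI) units; the chapter does not say which convention applies.
ZB = 10**21  # bytes per zettabyte
GB = 10**9   # bytes per gigabyte

total_bytes = 40 * ZB          # projected digital data volume for 2020 [2]
per_person_bytes = 5_200 * GB  # per-person figure quoted in the text

# Number of people implied by the two figures taken together.
implied_population = total_bytes / per_person_bytes
print(f"Implied population: {implied_population / 1e9:.2f} billion")
```

The result is roughly 7.7 billion people, which is close to the world population around
2020, so the two figures are consistent with each other.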
Owing to advances in technology, people spend much of their time on
the internet and social networks, sharing and communicating information in the
form of multimedia data [3] such as audio, video, text, and images. Multimedia big
data is considered a large volume of information, characterized by its massive
volume, diverse variety, and rapid velocity.
These data are mostly unstructured and may contain much noisy information.
Processing and analyzing them is difficult with traditional data
handling and analytic tools, because those tools were designed for traditional datasets
consisting of text and numbers. Therefore, multimedia big data requires more extensive and
sophisticated solutions to handle the large volume of unstructured data [4]. The major
problems that multimedia big data analytics must address efficiently and effectively
are data handling, data mining, visualization, and understanding the
different datasets generated by multimedia sources in order to handle real-time challenges.
Multimedia applications and services provide more opportunities to compute
multimedia big data. By 2020, it is anticipated that 4 × 10^24 bytes may be generated.
Studies led by Cisco and IBM state that 2.5 quintillion bytes of data are generated
each day, which is equated to 5,200 GB per person in the world. Most of the
data is generated from IoT devices such as sensors, actuators, and home
appliances, and from social media. The Internet of Things (IoT) also poses new challenges
for multimedia big data owing to the mobility of IoT devices, data gathering from
omnipresent sensor devices, and Quality of Experience (QoE). This chapter gives an
extensive overview of the multimedia big data challenges, the impact of multimedia big
data on IoT, and the characteristics of multimedia big data computing from the 10 V's
perspective, and further addresses the opportunities and future research directions of
multimedia big data in IoT.

1.1 Big Data Era

Understanding the big data concept is essential to understanding the characteristics,
challenges, and opportunities of multimedia big data. This section describes the dawn of big
data and its challenges. Over the past two decades, the amount of data has increased
enormously in different fields. In 2011, the International Data Corporation
(IDC) reported that the total volume of data created and copied worldwide
had grown ninefold within 5 years, to 1.8 × 10^21 bytes
(1.8 ZB). This figure is expected to at least double every 2 years in the near future [5].
Owing to the massive global growth in data, the term big data is predominantly used to
describe enormous datasets. Compared with traditional datasets, big data calls for more
real-time analysis because it is largely unstructured. Recently, industries and government
agencies have developed an interest in this enormous volume of data and have announced
their first plans for research and applications in big data [6]. Big data challenges
and concerns are extensively reported in public media [7–9]. Big data provides novel
opportunities for realizing new value and for gaining detailed knowledge of hidden
value, and it also raises the question of how to organize and manage multimedia
datasets efficiently. At present, a large volume of data is being generated rapidly from
Internet sources. For example, Facebook produces over 10 PB (petabytes) of
log data per month, Google deals with hundreds of PB of data, and
Alibaba produces tens of terabytes of data per day for online trading [10]. The advancement
of IoT also contributes significantly to generating a large amount of data rapidly. For example,
people upload an average of 72 h of video to YouTube per minute [10].
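The growth figures above can be turned into rough arithmetic. The sketch below assumes
that the 1.8 ZB baseline applies to 2011 and that the doubling period stays constant at
2 years; `project_volume` is a hypothetical helper introduced here for illustration, not
something taken from the cited report.

```python
def project_volume(base_zb, base_year, target_year, doubling_years=2.0):
    """Project data volume in ZB, assuming a constant doubling period (an assumption)."""
    elapsed = target_year - base_year
    return base_zb * 2 ** (elapsed / doubling_years)

# Ninefold growth over 5 years expressed as a constant annual growth factor.
annual_factor = 9 ** (1 / 5)

# Projection from the 2011 baseline to 2020 under the doubling assumption.
volume_2020 = project_volume(1.8, 2011, 2020)

print(f"Annual growth factor: {annual_factor:.2f}")
print(f"Projected volume in 2020: {volume_2020:.1f} ZB")
```

Under these assumptions, ninefold growth in 5 years corresponds to an annual factor of
about 1.55, and the projection for 2020 is about 40.7 ZB, which is in line with the 40 ZB
figure quoted in the Introduction [2].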
There is no universally agreed definition of big data. In 2001, Doug Laney described the
issues and opportunities brought by increased data in terms of the 3 V's model, i.e., Volume,
Velocity, and Variety [13]. IBM [11] and the Microsoft research department [12] continued
to use the 3 V's model to describe big data over the subsequent fifteen years.
Volume refers to the large volume of data generated and collected, Velocity refers to the
speed of data generation, and Variety refers to the diverse forms of data, which include
structured, unstructured, and semi-structured data such as text, audio, videos, and web
pages. In 2010, Apache Hadoop defined big data as datasets that traditional computers
cannot process and analyze [14]. In 2011, McKinsey & Company
defined big data as the next frontier for innovation, competition,
and productivity; at that time, big data ranged from TB to PB [15]. The key features
addressed by McKinsey & Company include the continuing growth of big data as
well as the management of big data.
Traditional database technologies cannot manage big data. Views on its definition
still differ, including that of the International Data Corporation (IDC), one of the
most influential leaders in big data research. IDC
defines big data as a new generation of technologies and architectures
designed to economically extract value from very large volumes of a wide
variety of data. Under this definition, big data is characterized by 4 V's: Volume, Variety,
Velocity, and Value. This characterization highlights the most difficult part of big
data, namely how to extract value from a large volume of datasets.
Academia and industry have discussed the characterization of big data
extensively [16].

1.2 Big Data Challenges

Big data poses challenges in data storage, data management,
data acquisition, and analysis. The traditional Relational Database Management System
(RDBMS) is not suitable for unstructured and semi-structured data. Database
management and analysis that rely on an RDBMS require more expensive hardware.
Traditional RDBMSs cannot manage the large
capacity and diversity of big data in terms of data types and sources.
From a different perspective, the research community has proposed solutions for
handling large volumes of big data. For example, distributed file systems and NoSQL
[17] databases are suited to the permanent storage and management of large-scale
disordered datasets, and cloud computing satisfies the infrastructure needs
of big data. Various technologies have been developed for
big data applications. Some authors [18] have addressed the issues and difficulties of
big data applications.
Some of the big data challenges are as follows:
• Data Representation: Big datasets differ in structure, semantics,
granularity, and openness. The main goal of data representation is to make the
data more meaningful for computer analysis and user interpretation. Inappropriate
data representation reduces the value of the original data and hinders analysis,
whereas efficient data representation enables efficient operations on datasets.
• Redundancy reduction and data reduction: Big datasets contain a large amount
of redundant data. Reducing the highly redundant data
generated by sensor networks in IoT applications is an effective way to reduce the cost
of the whole system.
• Analytical mechanism: The analytical mechanisms of big data must process
vast volumes of heterogeneous data within a limited time. Traditional
RDBMSs have limited scalability and expandability, and so cannot
meet the performance requirements. Non-relational database systems
have the unique advantage that they can process unstructured data;
still, they encounter problems in terms of performance
and specific applications. A compromise between relational
and non-relational databases for big data is a mixed database architecture (as used by
Facebook and Taobao), which integrates the advantages of both.
• Expandability and Scalability: The analytical systems and algorithms for big data
should support current as well as future datasets and handle the enormous
growth of complex data.
• Energy Management: Energy consumption is a significant problem that has
drawn attention from an economic perspective. The different operations on
multimedia big data, such as acquisition, processing, analysis, storage, and transmission
of huge volumes of data, consume more and more energy. System-level power
consumption control and management mechanisms should be established while ensuring
the expandability and accessibility of big data.
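The redundancy reduction challenge can be made concrete with a small sketch. The
technique shown, dead-band filtering of sensor readings, is chosen here purely for
illustration and is not taken from the chapter; the function name, the tolerance, and
the sample readings are all assumptions.

```python
def deadband_filter(readings, tolerance):
    """Keep a reading only if it differs from the last kept one by more than tolerance.

    readings is a list of (timestamp, value) pairs in time order.
    """
    kept = []
    for timestamp, value in readings:
        if not kept or abs(value - kept[-1][1]) > tolerance:
            kept.append((timestamp, value))
    return kept

# Made-up temperature readings from a single sensor.
readings = [(0, 21.0), (1, 21.0), (2, 21.1), (3, 22.5), (4, 22.5), (5, 21.0)]
reduced = deadband_filter(readings, tolerance=0.2)
print(reduced)
```

Here six readings are reduced to three, because readings that repeat the previous value
within the tolerance carry little new information. The trade-off is that changes smaller
than the tolerance are lost.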

1.3 Big Data Applications in Multimedia Big Data

The multimedia big data management system depends on big data techniques to
process and manipulate multimedia big data efficiently.
The applications of big data in multimedia big data analytics are as follows:
• Social Networks: Many research works have been performed on social network
big data analysis [19]. Tufekci et al. analyze the challenges of studying people's social
activities and behaviors through Twitter hashtags, which offer a large number of datasets,
visibility, and ease of access. Ma et al. address the emerging technology of
social recommender systems, which are mainly used in social networks to share
multimedia information. Davidson et al. presented the YouTube video recommendation
system, which integrates social information and personalizes videos [18].
• Smartphones: Recently, smartphones have overtaken other electronic
devices such as personal computers and laptops in usage. Smartphones offer
advanced technologies and capabilities such as Bluetooth, a camera, network
connectivity, the Global Positioning System (GPS), and a powerful Central Processing
Unit (CPU). Using smartphones, users can manipulate, process, and access
heterogeneous multimedia data. Lane et al. [19] address mobile sensing with smartphone
sensors and the associated data analysis issues, such as data sharing, influence, security,
and privacy. Other smartphone challenges that have been investigated include
the large volume of data, security, and multimedia cloud computing.
• Surveillance Videos: Surveillance videos are a significant source of multimedia big
data. Xu et al. [20] present innovative big data solutions that address
the volume, velocity, variety, and value of multimedia
generated from surveillance sources such as traffic control, IoT, and criminal
investigation. Shyu et al. [21] present an approach for detecting semantic concepts
in surveillance videos. One of the promising applications of multimedia big
data is smart city surveillance.
• Other applications: Other applications of multimedia big data include
health informatics, smart TVs, the Internet of Things (IoT), and disaster management
systems. Biomedical and healthcare data are considered a
primary source of multimedia big data. They comprise a wide variety and huge volume of
data such as patient records, medical images, and physician prescriptions. Kumari
et al. [22] examined the role of IoT, fog computing, and cloud computing in
healthcare services.

2 Definition and Characteristics of Multimedia Big Data

Multimedia big data is a theoretical concept, and it has no precise definition.
It differs from typical big data in that it is heterogeneous, human-centric,
spans different forms of media, and is larger in size.
Some of the features of multimedia big data are given below:
• Multimedia big data comprises an enormous number of data types as compared to
traditional big data. Multimedia datasets are more easily understood by humans
than by machines.
• Multimedia big data is more difficult to process than traditional
big data because it consists of different types of audio and video data such
as interactive videos, stereoscopic three-dimensional videos, social videos, and
so forth.
• It is challenging to model and characterize multimedia big data, as these data are
collected from diverse (heterogeneous) sources such as pervasive portable mobile
devices, sensor-embedded devices, the Internet of Things (IoT), the Internet, digital
games, virtual worlds, and social media.
• It is challenging to analyze the content and context of multimedia big data,
which are not constant over time and space.
• Securing multimedia big data is complicated by the rapid increase in
sensitive video data in communication.
• Multimedia big data must be processed swiftly and continuously
in order to keep up with the transmission speed of the network. For real-time
computing, multimedia big data needs to be stored so that the
enormous amount of data can be transferred in real time.
From the characteristics discussed above, it is observed that multimedia
big data leads to some fundamental challenges, such as the complexity of cognition and
understanding, the analysis of complex and heterogeneous data, the difficulty of managing
the security of distributed data, quality of experience, quality of service, detailed
requirements, and performance restrictions that arise from multimedia big data applications.
These challenges are associated with the processing, storage, transmission, and analysis
of multimedia big data, and they open up further research directions in
the area of multimedia big data.
Figure 1 shows the diverse sources of multimedia big data. The term big data
refers to datasets that can no longer be handled by traditional data
processing and analysis software because of their large size
and complexity. These massive datasets are both structured and unstructured,
which makes tasks such as querying, sharing, transferring, updating, collecting,
storing, visualizing, and analyzing the data, as well as ensuring security
and privacy, very challenging. Unstructured data does not have a fixed row and column
format. Examples of unstructured data are picture files, audio files, audiovisual files,
webpages, and other kinds of multimedia content.
Such data does not fit neatly into a database. Compared with structured data,
unstructured data proliferates every second. Unstructured datasets contain two kinds
of data: captured data and user-generated data. Captured data is generated
based on users' behavior. User-generated data is created by users themselves; examples
are comments, posts, photos, and videos posted on
Facebook (Facebook.com 2016), tweets and re-tweets on Twitter (Twitter.com 2016), and
videos on YouTube (Youtube.com 2016). Structured data refers to data that has a
fixed size and organization. It can be managed and stored easily in a database.

Fig. 1 Different sources of multimedia big data
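The difference between structured and unstructured data can be illustrated with a small
sketch using Python's built-in sqlite3 module. The table names, column names, and byte
values below are made up for illustration and do not come from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured data: fixed columns with declared types, directly queryable.
conn.execute("CREATE TABLE reading (sensor_id TEXT, ts INTEGER, temperature REAL)")
conn.execute("INSERT INTO reading VALUES (?, ?, ?)", ("s1", 1700000000, 21.5))

# Unstructured data: the database can store the bytes but cannot query inside them.
conn.execute("CREATE TABLE media (name TEXT, payload BLOB)")
fake_image = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"\x00" * 16  # placeholder, not a real image
conn.execute("INSERT INTO media VALUES (?, ?)", ("frame.jpg", fake_image))

# A value-based query works on the structured table.
warm = conn.execute("SELECT sensor_id FROM reading WHERE temperature > 20").fetchall()
# For the media payload, only generic properties such as its length are available.
size = conn.execute("SELECT length(payload) FROM media").fetchone()[0]
print(warm, size)
```

The structured table can be filtered by temperature directly, whereas the database sees
the media payload only as an opaque sequence of bytes. Any meaning in it (objects, speech,
scenes) has to be extracted by separate analysis.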

2.1 Challenges of Multimedia Big Data

Compared with traditional (text-based) big data, multimedia big data
poses more challenges in basic operations such as storing enormous datasets and
processing, transmitting, and analyzing data. Figure 2 depicts multimedia big
data and its challenges.
• Multimedia data abstraction. Data types: videos, audio, text, IoT devices, social
networks, etc. Challenges: volume, real-time, unstructured, noisy, uncertainty, etc.
• Multimedia database. Data storage: RDBMS, MMDBMS, NoSQL, graph DBs, ORDBMS,
key-value stores, etc. Challenges: store, manage, extract/retrieve, unstructured data,
and heterogeneous datasets.
• Multimedia data sharing. Sharing system: cloud, online file sharing system, wireless
data sharing. Challenges: more storage, bandwidth, maximum file size, data types, human
efforts.
• Multimedia data mining. Data processing: data cleaning, data transformation, data
reduction, etc. Feature analysis: videos, audios, textual, motion, spatiotemporal, etc.
Machine learning: supervised, unsupervised, semi-structured, etc. Challenges:
multimodality data representation, complexity, noisy, semi-structured data, efficiency,
real time, accuracy.

Fig. 2 Multimedia big data and its challenges

The following are some of the challenges of multimedia data:
• Real-time and quality of experience requirements: Multimedia big data services
are delivered in real time. It is difficult to meet
Quality of Experience requirements, which call for real-time
online streaming while concurrently processing the data for analysis, learning, and mining.
• Unstructured and multimodal data: Multimedia big data is difficult to represent,
store, and model because it is unstructured and multimodal and
is acquired from heterogeneous sources. It is challenging to
transform unstructured multimedia data into structured data and to represent
multimedia big data gathered from different sources.
• Perception and understanding complexity: Multimedia data cannot be readily
understood by computers because of the semantic gap between low-level and high-level
semantics. Furthermore, multimedia data vary over time and space.
• Scalability and efficiency: Multimedia big data systems must perform
huge computations, so they must make efficient use of communication, computation, and
storage resources.
The above fundamental challenges lead to four problems:
1. Representation and Modeling: How can unstructured data be converted
into structured datasets? How can representations and models be created for
unstructured, multimodal multimedia data gathered from heterogeneous sources?
2. Data Computing: How can data mining and learning be performed effectively
to examine the data?
3. Online Computing: How can real-time multimedia data received in parallel be
analyzed, processed, mined, and learned from concurrently?
4. Computing, storage, and communication optimization: How can a
multimedia architecture be designed to use storage, processing, and communication
efficiently?
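Problem 3 asks how data can be processed as it arrives rather than after it has all been
stored. One generic building block, shown here only as an illustration and not as the
chapter's method, is an incremental statistic that is updated one value at a time. The
sketch uses Welford's online algorithm; the class name and the sample values are
assumptions.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance without storing the stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        """Fold one new value into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have arrived."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for frame_size_kb in [2, 4, 4, 4, 5, 5, 7, 9]:  # made-up values arriving one at a time
    stats.update(frame_size_kb)
print(stats.mean, stats.variance)
```

For this stream the mean is 5 and the sample variance is 32/7, about 4.57. Memory use is
constant regardless of stream length, which is the property that online computing needs.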

3 The Relationship Between IoT and Multimedia Big Data

With the rapid development of the IoT, a huge number of sensors are embedded in
numerous devices, from personal electronics to industrial machines, which are
connected to the internet. The embedded sensors acquire various kinds
of data such as home appliance data, environmental data, scientific data, geographical
data, transportation data, medical data, personal human data, mobile equipment data,
public data, and astronomical data. Multimedia big data collected from
IoT devices has different characteristics from typical big data because of
the diverse characteristics of its sources, such as heterogeneity, different types of data
(video, audio, and image), lack of structure, and noise.
According to a report by IHS Markit, the number of connected IoT
devices may exceed 125 billion by 2030, which will generate an enormous amount of IoT data.
The technologies currently available for processing multimedia big data are not enough to
face the challenges of the coming era. Many IoT operators recognize the importance
of advances in multimedia big data for IoT. It is essential to adopt
IoT applications in the development of multimedia big data. With the rapid growth
of IoT, the enormous amount of multimedia data provides more openings for the
growth of multimedia big data. These two well-known technological developments
are mutually dependent and should be developed together, which also
provides more openings for research on IoT.

4 Multimedia Big Data Life cycle

The emergence of IoT devices is having a significant impact on the multimedia big
data life cycle. The fundamental challenges are addressed with the help of the multimedia
life cycle stages.
Figure 3 depicts the different stages of the multimedia big data life cycle, which consists of
data collection, processing, storage, dissemination, and presentation [23], and
Fig. 4 shows the key technologies of multimedia big data.

4.1 Generation and Acquisition of Data

Data Generation. The first phase of the multimedia big data life cycle is data generation.
The best example of multimedia big data is Internet data. A large amount of
Internet data is generated from browsing data, forum posts, chat records, blog messages,
and videos. These data record the day-to-day activities of people's lives and are
generated from diverse heterogeneous sources such as cameras, sensors, and videos.
The primary sources of multimedia big data are sensing information from connected
devices (Internet of Things), data generated from scientific research, people's
communication and location information, trading datasets in enterprises, etc. Multimedia
big data is mainly generated from IoT, which is the primary source of big data. Big
data is generated from IoT-enabled smart cities, industry, agriculture, traffic,
transportation, healthcare, public departments, etc.

Fig. 3 Different phases of the multimedia life cycle



Fig. 4 Key technologies in multimedia big data



Data Acquisition. Acquisition is the first phase of the multimedia life cycle, in which
multimedia data is obtained from heterogeneous sources such as the Internet of Things (IoT),
sensors, actuators, social media, and digital games. These sources generate different types
of multimedia big data, such as audio, 2D and 3D virtual worlds, camera videos,
online streaming videos, social videos, Hypertext Markup Language (HTML), and tables.
Recently, researchers have proposed many standards for video coding. Compared
with typical big data, multimedia big data is harder to acquire from different
sources because it is represented in an unstructured way. Unstructured datasets
are proliferating in volume, size, and quality. These features of multimedia
big data offer opportunities to design new representation methods for
complex and heterogeneous datasets. Table 1 compares multimedia
big data with typical datasets and big data.
IoT multimedia big data generation and acquisition. To acquire, process, and transmit
multimedia data from IoT devices, the IoT network system is divided
into three layers: the physical (sensing) layer, the network layer, and the
application layer. Acquisition is carried out by the sensing layer, which consists of
sensor networks. Information transmission and processing are carried out by the
network layer: the sensor network handles short-range transmission, and
long-distance transmission is carried out over the internet. The
application services of the Internet of Things are carried out by the application layer.
The features of data generated from IoT are as follows:
• Large-scale multimedia data;
• Heterogeneity;
• A limited amount of useful data due to noise;
• Strong time and space correlation.
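The division of work among the three layers can be sketched in a few lines. This is a
toy model under stated assumptions, not an IoT implementation: the function names, the
JSON encoding, the sensor identifiers, and the threshold are all hypothetical.

```python
import json

def sensing_layer(sensor_id, value):
    """Sensing layer: acquire one raw reading (the values used below are made up)."""
    return {"sensor": sensor_id, "value": value}

def network_layer(reading):
    """Network layer: serialize the reading for transmission and decode it on arrival."""
    packet = json.dumps(reading).encode("utf-8")  # bytes that would travel over the network
    return json.loads(packet.decode("utf-8"))

def application_layer(readings, threshold):
    """Application layer: a toy service that reports sensors above a threshold."""
    return [r["sensor"] for r in readings if r["value"] > threshold]

raw = [sensing_layer("cam-1", 0.2), sensing_layer("cam-2", 0.9)]
received = [network_layer(r) for r in raw]
alerts = application_layer(received, threshold=0.5)
print(alerts)
```

Each function only sees the output of the layer below it, which mirrors the separation
of acquisition, transmission, and application services described above.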

Table 1 Characteristics of multimedia big data, typical datasets, and big data
Characteristics          Typical datasets   Big data                    Multimedia big data
Volume                   Less               Medium                      Big
Data size                Definite           Uncertain                   Uncertain
Inferring video          Not at all         No                          Yes
Representation of data   Structured data    Structured data             Unstructured data
Real-time                Not at all         Yes                         Yes
Human-centric            Not at all         No                          Yes
Response                 No                 No                          Yes
Data source              Centralized        Heterogeneous distributed   Heterogeneous distributed
Complexity               Low                Medium                      High

4.2 Data Compression

The size of multimedia big data is reduced so that the data can be stored, communicated,
and processed efficiently. Multimedia data compression refers to eliminating redundant
data from a dataset. Redundant data refers to duplicated or surplus data in the datasets,
which increases data inconsistency, storage space, and data transmission cost and
delay, and reduces data reliability.
• Feature-transformation-based data compression: Numerical data reduction
is carried out by compressive sensing and wavelet transforms.
• Cloud-based compression: A large amount of multimedia data is produced today
with the advent of the IoT era. Many organizations are now moving
to the cloud to store enormous volumes of multimedia data, which leads to
storage issues in cloud computing. The storage issues relate to space, time,
access control, validation, etc. Facebook has declared that 300 billion pictures are
shared per day. Microsoft has announced that its cloud storage service
accommodates approximately 11 billion pictures. Many compression techniques that are
efficient in space and time are available for storing multimedia data in the cloud.
Consequently, research on multimedia data compression for cloud computing is
of increasing importance in the computing community.
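To make wavelet-based reduction concrete, the following minimal sketch applies one level
of an unnormalized Haar transform to a short signal and discards small detail
coefficients. It is an illustration under simplifying assumptions (an even-length,
one-dimensional signal and a hand-picked threshold), not the method of any particular
system cited in the chapter.

```python
def haar_forward(signal):
    """One level of the unnormalized Haar transform: pairwise averages and differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))  # assumes an even-length signal
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_inverse(averages, details):
    """Rebuild the signal from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal

def compress(signal, threshold):
    """Zero out small detail coefficients; runs of zeros are cheap to store or encode."""
    averages, details = haar_forward(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in details]
    return averages, kept

signal = [10.0, 10.2, 10.1, 9.9, 20.0, 4.0, 10.0, 10.0]  # made-up samples
averages, details = compress(signal, threshold=0.5)
restored = haar_inverse(averages, details)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(details, max_error)
```

Only one of the four detail coefficients survives the threshold, yet the large jump in
the signal is reconstructed exactly and the remaining samples are off by about 0.1 at
most. This is lossy compression: the threshold trades storage against fidelity.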

4.3 Multimedia Data Representation

Multimedia data is received from different sources, and each source
represents the data in a different format. Multimodal analysis needs a common
representation of the data. Multimedia data representation comprises the following
methods:
1. Feature-based data representation: Some features of multimedia big data are
common across space or time; feature-based data representation
extracts the data across the different combinations of features. Much research is
currently being carried out on feature vectors for content-based multimedia
retrieval. Depending on the application, features are extracted from audio streams,
video streams, or image pixels and combined into vectors.
This application-specific approach leads to a lack of scalability and accuracy in
feature-based data representation.
2. Learning-based representation: Extracting a common feature space is a
challenging task in multimedia big data because of the large volume of data gathered from
different sources. A representation that is learned in order to extract the hidden space
is called learning-based or machine-based representation.
Many learning-based representation approaches have been suggested for representing
multimedia big data. In recent years in particular, deep architectures have been
extensively applied to representation learning.
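As a minimal illustration of feature-based representation, the sketch below turns the
pixels of a grayscale image into a normalized intensity histogram and compares two such
vectors with cosine similarity, as a content-based retrieval system might. The choice of
feature, the number of bins, and the tiny pixel lists are assumptions made for the
example.

```python
import math

def histogram_features(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a grayscale image given as a flat list of ints."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        counts[min(int(p / width), bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Made-up eight-pixel "images": two mostly dark, one mostly bright.
dark = histogram_features([10, 20, 30, 40, 50, 60, 200, 220])
also_dark = histogram_features([5, 15, 25, 35, 45, 70, 210, 230])
bright = histogram_features([250, 240, 230, 220, 210, 200, 30, 20])

print(cosine_similarity(dark, also_dark), cosine_similarity(dark, bright))
```

The two dark images score higher against each other than the dark image does against
the bright one. A hand-picked feature like this is tied to the application, which is the
scalability and accuracy limitation noted in item 1.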

4.4 Data Processing and Analysis

Once the data is acquired and stored, the next phase of the life cycle is data processing
and analysis. The raw multimedia data received from heterogeneous
sources is unstructured and noisy. Unstructured large-scale multimedia
datasets are not directly suitable for analysis because the data is sparse, noisy, and
diverse, which makes analysis troublesome and sometimes infeasible. These problems
can be alleviated by preprocessing methods. Data preprocessing is the
conversion of unusable data into new, cleaned data for further analysis.
After preprocessing, the datasets are ready for higher level analysis.
Multimedia data preprocessing comprises data cleaning, data transformation,
and data reduction [24], as follows:
Data Cleaning: According to reports, data scientists spend almost 60% of their
time on organizing and cleaning data. Data organization and cleaning [23]
comprises noise reduction, acquisition, outlier identification, and the avoidance
of inconsistencies. Data cleaning improves data quality and reduces the
discrepancies and faults in the data. Data imputation methods are used to handle
missing data values. To improve the final results, error-aware data mining
approaches incorporate information about the noise. Noisy semi-structured data is
converted into clean data with the help of data manipulation and preprocessing tools.
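
A minimal sketch of two of the cleaning steps mentioned above, imputation of missing values and outlier identification, is given below. It assumes median imputation and Tukey's interquartile-range rule, neither of which is prescribed by the text; the readings are made-up sample values.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


# Hypothetical sensor readings; None marks a missing sample.
readings = [20.1, 19.8, None, 20.4, 95.0, 20.0]
cleaned = impute_missing(readings)
outliers = flag_outliers(cleaned)
```

The median is used here instead of the mean because a single extreme reading would otherwise distort the imputed value.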
Data Integration and Transformation: Data integration is the process of
combining heterogeneous sources, together with their metadata, into a consistent
source. It detects data conflicts and resolves them. Data transformation is another
crucial step in preprocessing; it includes data formatting, aggregation, and
normalization. Extensive research [25] is under way to develop a common
representation model that transforms different data into enhanced and simplified data.
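
To make the normalization step concrete, the following sketch rescales readings from two sources with different units to a common [0, 1] range. Min-max scaling is only one possible normalization and is an assumption here, as are the variable names and values.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant input carries no spread
    return [(v - low) / (high - low) for v in values]


# Hypothetical readings from two sources with different units and scales.
temperature_c = [18.0, 21.0, 24.0]
brightness_lux = [100.0, 550.0, 1000.0]
normalized = [min_max_normalize(temperature_c),
              min_max_normalize(brightness_lux)]
```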
Data reduction: Recently, many data compression techniques have been proposed to
handle large amounts of multimedia data. Research has mainly focused on feature
reduction and instance reduction. In instance reduction [26], the quality of the
mining model is improved by reducing the original datasets as well as the complexity
of the data without affecting the original data structure or the integrity of the data.
• Data Analysis: As multimedia big data research advances with the development
of IoT, conventional data analysis faces new complications in multimedia big
data processing. Conventional big data analysis is generally restricted to a single
data format.
• Feature Analysis: The current explosion of multimedia data increases the
complexity of data analysis as well. Feature extraction bridges the gap between
low-level multimedia characteristics and high-level semantic content.
Extracting features from massive datasets is time-consuming, so the whole
process is parallelized and shared among numerous systems. A recent study of
fast feature extraction [27] compared three big data techniques for multimedia
feature extraction: Apache Hadoop, Apache Storm, and Apache Spark. Schuller
et al. [23] studied how to extract features directly from compressed audio data.
• Deep Learning Algorithms: Popular deep learning toolboxes have motivated many
researchers to extract large-scale features using deep learning algorithms. Deep
learning has mainly focused on unsupervised feature learning, and very little
deep learning work has been carried out on multimodal features. An audiovisual
speech classification framework has been proposed that uses three learning
techniques: a fusion-based method, a cross-modality method, and a shared
representation learning method. In the mid-2000s, feature reduction techniques
were proposed for large-scale real-time multimedia data. In online feature
selection (OFS), an online learner is only allowed to maintain a classifier that
involves a small, fixed number of features. Group and nonlinear feature selection
methods based on adaptive feature scaling increase the performance and speed of
the training process.
• Machine Learning: Machine learning is the process of improving the
performance of computer programs automatically by learning from data through
experience. The main purpose of machine learning is to learn a specific task for
which the class label is unknown. Machine learning is classified into supervised
and unsupervised learning. In unsupervised learning, no label is associated with
the data instances. Supervised learning uses an algorithm to learn the mapping
function from the input to the output.
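
The budget constraint of online feature selection can be sketched as follows. This is a simplified, perceptron-style illustration of the idea that the learner keeps only a fixed number of nonzero weights; it is not the exact algorithm of any cited paper, and the learning rate, the toy stream, and the function names are assumptions.

```python
def truncate(weights, budget):
    """Keep the `budget` largest-magnitude weights and zero out the rest."""
    if budget >= len(weights):
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:budget])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def online_feature_selection(stream, n_features, budget, learning_rate=0.2):
    """Perceptron-style online learner restricted to `budget` nonzero weights."""
    weights = [0.0] * n_features
    for x, y in stream:  # y is +1 or -1
        score = sum(w * xi for w, xi in zip(weights, x))
        if y * score <= 0:  # update only when the current model is wrong
            weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
            weights = truncate(weights, budget)
    return weights


# Hypothetical stream in which feature 0 determines the label.
stream = [([1.0, 0.2, -0.1], 1), ([-1.0, 0.3, 0.1], -1),
          ([0.9, -0.4, 0.2], 1), ([-0.8, -0.1, -0.3], -1)]
weights = online_feature_selection(stream, n_features=3, budget=1)
```

Each instance is seen once and then discarded, which is what makes this style of learner attractive for streaming multimedia data.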

4.5 Storage and Retrieval of Multimedia Data

Because the data is unstructured and heterogeneous, the management and retrieval
of multimedia big data are carried out with the help of annotation. Annotation [12]
is categorized as manual or automatic. Manual annotation [28] is done by users,
source providers, and tools. Automatic annotation is carried out by machine
learning algorithms and, given the endlessly growing volume of data, is of greater
interest than manual annotation. The main problem of automatic annotation is the
semantic gap. Semantic data is extracted from multimedia text documents using
Latent Dirichlet Allocation (LDA). Deep learning techniques are now widely used
to extract annotations for videos and pictures. In general, a Multimedia Database
Management System (MMDBMS) holds multimedia data and the relationships among
them, which makes it different from a traditional relational database management
system. The characteristics of a multimedia database include storage, spatial and
temporal constraints, presentation of data, retrieval, etc.
The main requirements of a multimedia database are traditional database
capabilities, data modeling, storage management, retrieval, integration of media,
interface and interactivity, and performance. A multimedia database management
system has to satisfy the following requirements to manipulate and store data
efficiently:
• Data modeling for multimedia. Although various traditional database models
are available, such as relational, semantic, and network models, only a few
modeling methods have been proposed for multimedia databases because of the
unstructured nature of multimedia data. Multimedia data needs an object-oriented
data model for each type of media. A modeling system for multimedia documents
combines technologies such as an Object-Oriented Database Management System
and Natural Language Processing (NLP) to extract the vital information,
structure the input documents, and offer semantic retrieval. Data modeling is
mainly used to extract and retrieve information.
• High volume storage management. Storage management for multimedia is
characterized by significant volume and variety, which call for a hierarchical
structure. Hierarchical storage of multimedia big data increases the storage size
but decreases the performance.
• Query support and retrieval capabilities. Multimedia data needs different
kinds of queries, such as content-based and keyword-based queries. A multimedia
query typically does not return an exact match; it returns results that contain
objects similar to the query object. Multimedia consists of different media types,
which require consistent ranking and pruning approaches.
• Media integration, configuration, and presentation. Integration and
configuration play an essential role once unstructured data has been converted
into a structured format; they ensure the integrity and uniqueness of multimedia
data. Multimedia big data requires an efficient and effective presentation to
avoid excessive computation and storage.
• Performance. Performance is an essential aspect of multimedia big data. It
covers efficiency, consistency, real-time data processing and execution,
Quality of Service (QoS), Quality of Experience (QoE), and guaranteed
multimedia presentation. These performance goals are achieved with the help of
cloud computing and distributed processing.
• Multimedia indexing. A traditional RDBMS is generally not appropriate for
multimedia big data because of the unstructured data format. This problem is
solved with the help of indexing approaches, which have been proposed to
manage the different data types and queries. Indexing approaches are divided
into Artificial Intelligence (AI) and non-AI approaches.
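
The query behaviour described in the list above, where a multimedia query returns ranked similar objects instead of an exact match, can be sketched as a top-k similarity search over feature vectors. Cosine similarity, the in-memory dictionary used as an index, and the object identifiers are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query, index, k=2):
    """Return the k (object_id, score) pairs most similar to the query, best first."""
    scored = [(obj_id, cosine_similarity(query, vec)) for obj_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical feature index: object id -> feature vector.
index = {
    "img_001": [0.9, 0.1, 0.0],
    "img_002": [0.1, 0.8, 0.1],
    "vid_003": [0.7, 0.2, 0.1],
}
results = top_k_similar([1.0, 0.0, 0.0], index, k=2)
```

A linear scan like this does not scale to big data; it only shows the ranking semantics that an indexing structure would have to preserve.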

4.6 Assessment

Advances in information technologies and MEMS (Micro-Electro-Mechanical
Systems) technologies, and their extensive growth in numerous areas, have resulted
in an enormous amount of diverse data such as video, audio, and text. Given the
rapid development of multimedia data and services, it is vital to provide Quality
of Experience (QoE) to users. Either a subjective or an objective analysis is
carried out to test the quality of videos. Subjective analysis is carried out in a
test center, which requires more human resources and expense; it is generally not
used for real-time estimation. Objective tests are based on models of the Human
Visual System (HVS), and objective assessment relies on parameters obtained
from subjective assessment tests.

Fig. 5 Characteristics of multimedia big data
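
As an example of an objective quality measure, the sketch below computes the peak signal-to-noise ratio (PSNR) between a reference frame and a distorted frame. PSNR is not named in the text; it is used here only as a simple, widely known full-reference metric, and it correlates only loosely with human perception. The frames are made-up values.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    ref = [p for row in reference for p in row]
    dis = [p for row in distorted for p in row]
    mse = sum((r - d) ** 2 for r, d in zip(ref, dis)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 grayscale frames.
reference = [[100, 120], [140, 160]]
distorted = [[101, 119], [142, 158]]
score = psnr(reference, distorted)
```

Because it needs no human raters, a metric of this kind can be evaluated in real time, which subjective testing cannot offer.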

4.7 Computing

Given the enormous amount of multimedia data, organizing and processing
multimedia big data is a challenging task. Multimedia big data computing is a novel
paradigm in which data analytics is performed by combining large-scale computation
with mathematical models.

5 Characteristics of Multimedia Big Data

Multimedia big data is a collection of enormous and complex datasets. Figure 5
shows the characteristics of multimedia big data, and Figure 6 shows the five V's
of multimedia big data. These characteristics are described below.

Fig. 6 Five V's of multimedia big data

Volume: In big data, volume refers to the vast amounts of data generated
through the Internet of Things, portals, the internet, etc. According to
Worldometers (2016), there are more than 7.4 billion people in the world; almost
2 billion of them are connected to the internet, and the remainder use various
portable handheld devices, i.e., mobile devices. As a result of this technological
development, each product produces a huge volume of multimedia data through the
growth of Internet technology and the use of various devices. In particular, remote
sensors embedded in devices continuously produce heterogeneous data in either
structured or unstructured format. In the near future, the exponential growth
of multimedia data is expected to exceed yottabytes (10^24 bytes). For example,
YouTube has more than one billion users (YouTube.com 2016), who upload over
300 h of video per minute. Facebook had more than 1.4 billion users, 25 trillion
posts as of 2016 (StatisticsBrain 2016), and a total of 74 million Facebook pages.
In 2016, global mobile traffic was estimated at 6.2 billion gigabytes per month.
According to the Digital Universe study by International Data Corporation and
EMC Corporation, data generation has grown tremendously, from 800 EB in 2009
to 1.8 ZB in 2011, and is projected to reach about 40 ZB by 2020. It is very
challenging to handle such amounts of multimedia big data [26] with respect to
gathering, storage, analysis, preprocessing, sharing, and visualization.
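
The figures quoted above are easier to compare after a few unit conversions. The short calculation below uses decimal (SI) byte units and only the numbers given in the text; the variable names are illustrative.

```python
# Decimal (SI) byte units.
GB, EB, ZB, YB = 10**9, 10**18, 10**21, 10**24

# 300 hours of video uploaded per minute, expressed per day.
hours_uploaded_per_day = 300 * 60 * 24

# 6.2 billion gigabytes of mobile traffic per month, expressed in exabytes.
mobile_traffic_eb = 6.2e9 * GB / EB

# Growth of the digital universe from 800 EB (2009) to 1.8 ZB (2011).
growth_2009_to_2011 = (1.8 * ZB) / (800 * EB)
annual_growth = growth_2009_to_2011 ** 0.5

# How many times the projected 40 ZB fits into one yottabyte.
yb_over_40zb = YB / (40 * ZB)
```

Under these figures the data volume grew by a factor of 2.25 in two years, that is, by about 50% per year, and the 2020 projection is still a factor of 25 below one yottabyte.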
Velocity: The term velocity denotes the rate at which data is generated,
i.e., how fast the data is coming in. Hendrickson et al. [29] report that
information grows by one order of magnitude every 5 years. Every day, 5 billion
users browse the internet, tweet, upload, and send both multimedia and standard
data. People generate 58 million tweets and 2.1 billion queries on Twitter per day.
The number of YouTube users has increased by 40% since March 2014. Almost 50%
of Facebook account holders log into their accounts every day. Every minute,
Google receives about 2 million searches and queries (Google.com 2016), and it
processes 25 PB every day. Efficient management tools and techniques are required
to cope with the speed of multimedia big data.
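
The growth rate quoted above can be restated as a yearly factor and a doubling time, and the daily totals as per-second rates. The calculation below uses only the numbers given in the text and assumes steady exponential growth.

```python
import math

# "One order of magnitude every 5 years" expressed as a yearly growth factor.
annual_factor = 10 ** (1 / 5)

# Time needed for the data volume to double at that rate, in years.
doubling_time_years = math.log(2) / math.log(annual_factor)

# Rates implied by the daily figures quoted above.
SECONDS_PER_DAY = 24 * 60 * 60
tweets_per_second = 58e6 / SECONDS_PER_DAY
google_pb_per_hour = 25 / 24
```

A tenfold increase every 5 years corresponds to roughly 58% growth per year, or a doubling about every 18 months.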
Variety: The term variety refers to the diversity of data [29]. Examples of
variety are emails, voicemails, video messages, ECG readings, audio recordings,
etc. In the age of multimedia big data, the data gathered from heterogeneous
sources is represented as images or videos, which contain more information and
knowledge. Sources generally produce both structured and unstructured data.
Structured data has consistent formats and predefined lengths. Unstructured data
has no fixed format, which makes it very difficult to process. Unstructured data
can be processed with the help of Hadoop, where clustering methods are used to
process the data in a short interval of time. Unstructured multimedia big data
brings more challenges for analyzing, preprocessing, and extracting valuable data.
Veracity: In multimedia big data, the term veracity denotes the uncertainty,
noise, and deviation in the data. Ensuring the precision of the data is a very
challenging issue in multimedia big data, which makes it difficult to determine
how much of the data is reliable.
Value: Value is the most critical element of multimedia big data. It denotes the
use and retrieval of valuable information from these huge volumes and this
diversity of data. To analyze the data, it is essential to filter, sort, and select it.
The other essential V's of multimedia big data are as follows:
• Visualization: A key challenge of multimedia big data is how the data is
visualized. The technical challenges faced by the available visualization tools
stem from limitations of memory, functionality, expandability, and response
time. It is not possible to plot a billion data points using traditional graphs.
Multimedia big data needs different methods of representing data, such as data
clustering, parallel coordinates, circular network diagrams, sunbursts, etc.
• Vulnerability: Vulnerability refers to security concerns about multimedia big data.
• Validity: Validity denotes the correctness of the data for its intended use.
• Variability: Variability refers to the number of inconsistencies in the data, as
well as the speed at which multimedia data is loaded into the database.

6 Multimedia Big Data Challenges and Opportunities

With the proliferation of IoT, the world has marched into the era of multimedia big
data. The development of multimedia big data presents many challenges as well as
countless opportunities for the betterment of IoT applications.

6.1 Acquisition Challenges

Multimedia comes in many different types, such as videos, audio, speech, online
streaming videos, documents, graphics, geospatial data, 3D virtual worlds, etc.
Multimedia big data is unstructured, which makes it more complex to analyze than
typical big data. Unstructured data, which is proliferating in both quantity and
quality, is easily understood by users but difficult for machines to understand.
These are the main challenges of multimedia big data acquisition. The papers that
address these issues note the following: the representation and modeling of
multimedia big data is a very challenging task, and most studies have focused on
graph structure instead of video structure. In general, large-scale multimedia big
data acquired from a source is incomplete and uncertain, contains communication
errors, and may be affected by malicious attacks and data corruption; existing
work has mainly ignored hidden video content and differing levels of quality.
The BigKE method presents a knowledge framework that handles disjointed
knowledge and E-learning methods for data received from heterogeneous sources.
The stream features are derived from spatial and temporal information. Wu et al.
[30] present a tag-assignment stream clustering method for dynamic unstructured
data, which is modeled as a stream to describe the properties and interests of
users. Hu et al. [11] proposed a model for managing multimedia big data using a
semantic link network, which creates relationships among different multimedia
resources.

Table 2 Existing methods of acquisition process

Methods | Objectives | Limitations
--------|------------|------------
BigKE [30] | Knowledge framework to handle disjointed knowledge and E-learning methods from numerous heterogeneous sources | IoT not addressed
Semantic link network model [11] | Manages multimedia data using semantics | Issues on IoT not addressed
Wang [35], Pouyanfar et al. [44] | Review of multimedia big data | Issues on IoT not addressed
Kumari et al. [31] | Taxonomy and multimedia big data for IoT | Focused on IoT

Multimedia data acquisition for IoT applications is divided into three parts,
namely data gathering, compression, and representation [31]. Table 2 shows the
pros and cons of the existing acquisition process.
In the context of IoT, multimedia data is often collected from sensors. Data
collection has been carried out in several areas, such as forecasting the health
status of patients, wireless networks, the Internet of Multimedia Things (IoMT),
Healthcare Industrial IoT (Health-IIoT), and personal devices. The multimedia big
data collected from IoT devices is heterogeneous in nature. The main limitation of
the existing methods is that each method has different views and categories. When
designing a new method for data acquisition, the following factors should be
considered: unstructured data, heterogeneous sources, multimodality, dynamic
evolution, users' interests, spatial and temporal information, semantics, and
geographically distributed data.

6.2 Compressing Challenges

Multimedia big data is massive in size; it must be compressed before further
processing and storage.
Compressing multimedia big data brings more challenges than compressing
traditional datasets with conventional big data techniques. Because storage and
processing capability are limited, the data needs to be compressed effectively
with the help of signal processing and transformation.
The challenges that arise when compressing multimedia big data are as follows:
• Multimedia big data is difficult to handle because it is unstructured;
• Because of the large volume of data, it is challenging to compress at high speed;
• Data loss is very high because of the diverse sources.
The traditional big data reduction approaches for compression are the wavelet
transform and compressive sensing. Duan et al. [32] proposed a compression
technique based on feature descriptors that attains a large reduction ratio, which
depends on the coding approaches used. Bu et al. [22] proposed a deep
learning-based framework to extract multilevel three-dimensional shape features.
Xu et al. [20] proposed latent intact space learning to acquire abundant
information by merging multiple views. Herrera et al. [33] proposed an
architecture that handles data from various multimedia streaming stations, such
as TV and radio stations, to gather, process, analyze, and visualize the data. The
approaches mentioned above mainly focus on high-level integrated features in
multiple views. Effective description techniques are needed to extract high-level
features. Most of the existing approaches do not focus on IoT applications.
Table 3 shows the pros and cons of existing methods of data reduction and
collection.
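
To illustrate the wavelet-transform approach to data reduction mentioned above, the sketch below applies one level of an unnormalized Haar transform to a short signal and discards small detail coefficients. The threshold and the signal are assumptions; practical codecs use multiple levels, quantization, and entropy coding.

```python
def haar_forward(signal):
    """One level of the (unnormalized) Haar transform: averages and differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the signal from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal


def compress(signal, threshold):
    """Zero out small detail coefficients (the lossy step)."""
    averages, details = haar_forward(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in details]
    return averages, kept


# Hypothetical even-length signal, e.g. one row of pixel intensities.
signal = [10, 12, 50, 52, 90, 20, 30, 31]
averages, details = compress(signal, threshold=2.0)
reconstructed = haar_inverse(averages, details)
```

Only one of the four detail coefficients survives the threshold, so the sharp edge between 90 and 20 is preserved exactly while small variations are smoothed away.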

6.3 Storage Challenges

A large volume of multimedia big data is created continuously, and it is essential
to store it after compression. The size of multimedia big data is unbounded, and
it comes in a variety of media types. With the massive evolution of multimedia
big data, the quality and amount of unstructured data make storage more
challenging than for typical big data. Storage systems for typical big data are
based on NoSQL. In the multimedia big data scenario, it is impossible to store
all real-time streaming multimedia data. The limitations of existing storage
methods are given in Table 4.
New designs address the challenges of multimedia big data storage with regard to
feasibility and cost. Dede et al. [34] present a processing pipeline that combines
NoSQL storage with a big data processing platform (MapReduce). Wang et al.
[35] presented hybrid stream big data analytic models for multimedia big data
that address video data analysis, which comprises data preprocessing,
classification, recognition, and big data overload reduction.
Liu et al. [20] present a hashing algorithm based on deep learning and shallow
learning for efficient storage, indexing, and retrieval of multimedia data. A
NoSQL-based approach has been introduced to manage real-time embedded
databases efficiently. It is mainly designed for distributed storage of enormous
amounts of data and takes advantage of scaling. Integrating the IP Multimedia
Subsystem (IMS) with the Hadoop system improves the performance of IMS service
resources and provides them with scalable distributed storage and computing.
When designing a storage system for multimedia big data, the constraints of IoT
and cloud computing should be considered in order to improve performance and
distributed storage.

Table 3 Existing methods of data collection and reduction

Category | Survey | Area of interest | Pros | Cons
---------|--------|------------------|------|-----
Multimedia data collection | Gao et al. [39] | Machine learning feature analysis | Outline of high-dimensional multimedia big data and machine learning techniques | Not focused on multimedia big data and its technical challenges on IoT
 | Hu et al. [11] | Retrieval of video | An overview of video indexing and retrieval | Limited to a single multimedia data type; lacks multimedia big data challenges
 | Wang et al. [39] | Smart grid | A general overview of multimedia wireless sensor networks and their application in the smart grid | Limited to multimedia big data analytics
 | Madan et al. [45] | Predicting the health status of the patient | Low costs for providers | Wireless network issues not addressed
 | LUSTER [46], Hossain et al. [47] | Environmental monitoring using WSN; data collected from the Internet | Data reliability and efficient scalability; distributed and fault-tolerant storage | Prone to security breaches
 | Duan et al. [32] | Feature descriptor based on multimedia coding approaches | High compression ratio | Not focused on multimedia data on IoT

Table 4 Existing methods of data storage

Category | Survey | Methodology | Limitation
---------|--------|-------------|-----------
Multimedia big data storage | Dede et al. [34] | Combining NoSQL with a big data platform | Need to consider IoT and cloud computing to increase the performance and storage (applies to all three surveys)
 | Wang et al. [35] | Addressed video data analysis; high stream big data analytics |
 | Liu et al. [20] | Hashing algorithm based on deep and shallow learning |

6.4 Processing Challenges

The fundamental task of processing is to extract useful information for further
activities. Multimedia big data is generated by real-time applications, so it is
essential to process it effectively within a limited processing time. The
challenges of multimedia big data processing that need to be addressed are as
follows:
• A large volume of multimedia big data is generated continuously from
heterogeneous sources. It needs to be processed at high speed so that it can be
stored efficiently in real time.
• To handle the enormous amount of multimedia data, automated and intelligent
analytical techniques need to be developed to extract knowledge from
heterogeneous data.
• Processing multimedia big data requires parallel/distributed and real-time
streaming algorithms.
• Processing a huge volume of multimedia big data requires large-scale
computation, storage, communication, and networking resources, which should be
optimized.
Sparsity, spatial-temporal information, and heterogeneity should be considered
in future approaches to multimedia big data. The communication, storage, and
computational bottlenecks of processing should be reduced.
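
The parallel/distributed style of processing called for above can be sketched with a map step applied to independent chunks and a reduce step that merges the partial results. The example below uses a thread pool from the Python standard library as a stand-in for distributed workers; the record layout and function names are assumptions, and threads in CPython do not give true parallelism for CPU-bound work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: count media types within one chunk of records."""
    return Counter(record["type"] for record in chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def count_media_types(records, chunk_size=2, workers=2):
    """Split the records into chunks, map them concurrently, then reduce."""
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical records arriving from heterogeneous IoT sources.
records = [{"type": "video"}, {"type": "audio"}, {"type": "video"},
           {"type": "image"}, {"type": "video"}]
totals = count_media_types(records)
```

Because each chunk is processed independently, the same structure carries over to frameworks that distribute the map step across machines.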

6.5 Understanding Challenges

With the massive evolution of multimedia big data, there is a semantic gap between
low-level features and high-level semantics. Multimedia big data is difficult
for a device to understand, particularly because certain multimedia big data
changes with time and space. To bridge the semantic gap in multimedia big data,
efficient cross-media and multimodal tools and intelligent analytical methods
are needed. Schuhmacher et al. [31] present a knowledge graph, consisting of
objects, concepts, and relationships, for extracting knowledge associations from
heterogeneous data. The concepts and entities mirror real-world concepts.
Pattern matching methodologies are used to extract information from open sources
before the knowledge graph is constructed. The knowledge graph is used in
different applications, such as big data analytics, deep learning, and semantic
search. Extensive research is needed on constructing knowledge graphs
automatically to handle data at huge scale. Current research concentrates on
structured and semi-structured text data. More research is required on knowledge
graphs for unstructured data to handle the large volume of multimedia big data.
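
A knowledge graph of objects, concepts, and relationships can be held, in its simplest form, as a set of (subject, relation, object) triples. The sketch below is a toy in-memory version; the class, the relation names, and the facts are hypothetical and are not taken from the cited work.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory store of (subject, relation, object) triples."""

    def __init__(self):
        self._outgoing = defaultdict(list)

    def add(self, subject, relation, obj):
        self._outgoing[subject].append((relation, obj))

    def related(self, subject, relation):
        """Objects linked to `subject` by `relation`."""
        return [o for r, o in self._outgoing.get(subject, []) if r == relation]

    def reachable(self, start):
        """All entities reachable from `start` by following any relation."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for _, neighbor in self._outgoing.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen


# Hypothetical facts extracted from multimedia metadata.
graph = KnowledgeGraph()
graph.add("clip_42", "depicts", "traffic_jam")
graph.add("clip_42", "recorded_by", "camera_7")
graph.add("camera_7", "located_in", "city_center")
```

Following relations across entities is what lets a query about a video clip reach the location of the camera that recorded it.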

6.6 Computing Challenges

Multimedia big data is generated in real-time environments. It is produced
continuously by a large number of heterogeneous sources and must be processed
without interruption so that it can be stored efficiently and within time
constraints. Multimedia big data comprises a wide variety of data and is transient
in nature. It is therefore essential to design concurrent, real-time online
stream processing for the analysis of multimedia data. To perform computation on
large-scale data, storage, communication resources, and processing must be
optimized. With the development of communication technology, multimedia big data
travels through the network at very high speed, which brings challenges for GPU
computing. A cloud-based system has been presented for harvesting data from
multimedia big data in global camera networks. It receives multimedia data from
numerous devices, and the data can be evaluated instantaneously through an
application programming interface. The storage and computing problems of
multimedia big data can be addressed by cloud computing technology.
Sadiq et al. [36] addressed several challenges of multimedia big data received
from crowded heterogeneous sources. The authors present a framework for spatial
multimedia big data from various multimedia sources, which also handles spatial
queries on multimedia big data in real time. Cevher et al. [37] addressed the
computational, storage, and communication bottlenecks in big data and showed that
advances in convex optimization algorithms offer several unconventional
computational choices. The synchronization problem in multimedia big data has
been addressed using a therapy recorder, which implements a two-tier
synchronization process. It creates a synchronized multimedia therapy session
file and separates the complex media files. Garcia et al. [38] present a media
pipeline concept and a Platform as a Service (PaaS) scheme. Zhang et al. [28]
present an efficient and precise recommendation method for retrieving a
particular image from a large image database. The authors present three types of
content-based image retrieval; the relevant image is retrieved from the image
database on the basis of content comparison. The authors have also analyzed a
platform that facilitates public cultural services based on cloud computing and
the Hadoop system. The fusion of cloud computing and big data technology needs
to be considered to reduce computation time and improve data scalability. Data
traffic can be classified on the basis of local and structural features so that
the data can be processed in real time.

6.7 Security and Privacy Challenges

Multimedia big data consists of different datasets, which include private/personal
or sensitive videos [27]. With the explosion of video, multimedia big data
must be governed with complete security. Tracing and protecting multimedia big
data is both challenging and essential. To manage and access multimedia big
data securely, it is essential to study the implementation of a privacy policy. In
the context of IoT devices, the security of multimedia big data has to provide
confidentiality, integrity, and availability. The huge volume and heterogeneous
nature of the data make security more challenging for multimedia big data in
IoT. With centralized data, a single point of failure can expose users'
information, which violates the law. From the outset, real-world data mining has
been dogged by privacy issues. Encryption techniques are used to protect the
confidentiality of multimedia big data. At present, the available methodologies
and technologies ensure the privacy of static data only; protecting dynamic
datasets remains a challenging task.
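
Of the three security goals named above, integrity is the simplest to illustrate with the Python standard library alone. The sketch below attaches an HMAC-SHA256 tag to a media chunk so that tampering can be detected; it does not provide confidentiality, which would require encryption with a vetted cryptographic library. The key and the chunk are placeholders.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag for the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, key: bytes, tag: str) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(payload, key), tag)


# Hypothetical shared key and media chunk; a real key must come from secure storage.
key = b"demo-key-not-for-production"
chunk = b"\x00\x01frame-bytes"
tag = sign(chunk, key)
```

A receiver that holds the same key recomputes the tag and rejects any chunk whose tag does not match.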

6.8 Assessment Challenges

Quality of Experience (QoE) plays a significant part in multimedia big data for
video applications. Some of the challenges of multimedia big data assessment are
as follows:
• Measuring and quantifying user experience levels is a tedious task.
• Keeping track of multimedia big data applications in real time is complicated.
• How can Quality of Service (QoS) and QoE metrics be correlated effectively?
• How can a standard be obtained for users?
• How can customers' experience be analyzed efficiently?
• How can QoE be measured quickly and accurately under various standards?
Wang et al. [39] present the importance of monitoring and analyzing network traffic to improve
the customer experience and the allocation of network resources. Liu et al. [20] present a
methodology to monitor and analyze big data traffic with the help of Hadoop. Hadoop was designed
mainly for batch processing and is widely used for large-scale data processing. It is an
open-source, Java-based distributed computing platform for big data, developed as an Apache
project and inspired by Google's published work on MapReduce and the Google File System. In
Hadoop, MapReduce code is typically written in Java. The features of Hadoop are
cost-effectiveness, high efficiency, scalability, fault tolerance, and distributed concurrent
computing. The primary challenge of Hadoop is that it is tough to adapt for network measurement.
Recently, many research works have been carried out on QoE problems. An AdaBoost model was
presented to achieve higher accuracy; it shows the relationship between the IPTV set-top box
data and QoE. A user-centric QoE prediction methodology depends on various machine learning
algorithms such as artificial neural networks, decision trees, Gaussian Naïve Bayes classifiers,
and support vector regression. Sun et al. [4] present a decision tree to model video datasets,
which achieves good service and enhances the users' QoE. The main characteristic of this model
is that it provides the association between users' QoE and alarm data for IPTV. A cross-layer
prediction method was proposed to estimate mobile video quality without any reference model.
Recently, much research has been carried out on user-centric analysis and prediction of QoE
based on machine learning algorithms.
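
The MapReduce model used for traffic analysis can be illustrated with a small in-process sketch.
It is written in Python for brevity and does not use the Hadoop API; the record layout
(source IP, destination IP, bytes) and the function names `map_phase`, `shuffle`,
`reduce_phase`, and `run_job` are assumptions chosen for this example.

```python
from collections import defaultdict

def map_phase(record):
    """Emit one (source IP, byte count) pair per traffic record."""
    src_ip, _dst_ip, num_bytes = record
    yield src_ip, num_bytes

def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Sum all byte counts observed for one source IP."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical flow records: (source IP, destination IP, bytes).
flows = [
    ("10.0.0.1", "10.0.0.9", 1200),
    ("10.0.0.2", "10.0.0.9", 300),
    ("10.0.0.1", "10.0.0.7", 800),
]
totals = run_job(flows)
print(totals)
```

In a real cluster the map and reduce functions would run in parallel on many nodes; only the
division of work into the two phases is shown here.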

7 Opportunities

Despite all the challenges faced by multimedia big data, it still offers considerable
opportunities to the Internet of Multimedia Things (IoMT) to advance facilities and applications
through the efficient use of multimedia big data. With the proliferation of MEMS technologies,
the IoT is considered one of the most significant transitions in today's technology. IoT offers
more opportunities for multimedia big data analytics. Some examples of multimedia big data
computing for IoT applications are as follows:
E-Commerce: In this era, multimedia big data is growing much faster than traditional data. To
process a large quantity of data in real time, multimedia IoT big data analytics provides
well-designed tools for decision-making. The integration of multimedia big data with IoT
brings new challenges and opportunities for constructing a smart environment.
Social Media Analytics: Social media analytics collects data from social media such as
Facebook, Twitter, Google Plus, blogs, and Wikipedia, and analyzes these data statistically to
gather knowledge. Most e-commerce vendors use social media analytics to gain business value,
increase sales and profits, improve customer satisfaction, build the company's reputation, and
create brand awareness among people.
Smart Cities: The development of multimedia big data and the evolution of IoT technologies have
played a significant role in smart city initiatives. The integration of IoT and multimedia big
data is a promising new research area that has brought interesting challenges and opportunities
for attaining the goal of future smart cities. IoT is an important source of large amounts of
multimedia big data, which need high-speed processing, analysis, and transmission. Tanwar et al.
[40, 41] proposed an advanced security alert system architecture for the smart home using a
pyroelectric infrared sensor and a Raspberry Pi module.
Healthcare: Big data has a huge potential to alter the healthcare industry. Smart healthcare
devices produce a huge amount of information such as ECG, temperature, and blood sugar readings.
Healthcare devices monitor the real-time health data of patients, which reduces the overall cost
of the prevention and management of illnesses. From the analysis of health data, the doctor can
diagnose and detect diseases at an early stage. Due to high-speed access to the internet, many
people have started to use mobile applications to manage their health problems. These mobile
applications and smart devices are integrated and act as a Medical Internet of Things (MIoT).
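
As a minimal sketch of how real-time health data might be screened before it reaches a doctor,
the following example flags readings that deviate strongly from a sliding-window baseline. The
function name `detect_anomalies`, the window size, the 15% tolerance, and the heart-rate values
are all assumptions for illustration; they are not clinical thresholds.

```python
from collections import deque

def detect_anomalies(readings, window=5, tolerance=0.15):
    """Return indices of readings that differ from the mean of the previous
    `window` readings by more than `tolerance` (as a fraction of that mean)."""
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline = sum(recent) / window
            if abs(value - baseline) > tolerance * abs(baseline):
                flagged.append(i)
        recent.append(value)
    return flagged

# Hypothetical heart-rate stream in beats per minute (illustrative values only).
heart_rate = [72, 74, 73, 71, 75, 74, 110, 73, 72]
flagged = detect_anomalies(heart_rate)
print(flagged)
```

The first `window` readings are never flagged because no baseline exists yet, which is a
deliberate simplification in this sketch.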
The proper use of multimedia big data gathered from IoT improves economic outcomes and
productivity and brings new visions to the world. Based on a literature review, the challenges
for multimedia big data analytics in the Internet of Things (IoT) are identified.

8 Future Research Directions

Many organizations have widely acknowledged multimedia big data computing for IoT applications.
Still, multimedia big data for IoT is in its early stages, and many current challenges have not
been addressed. This section presents several challenges and future research directions of
multimedia big data computing for IoT applications.

8.1 Infrastructure

In this era, the amount of data generated by IoT devices exceeds the available computing
resources. To analyze multimedia big data, manufacturers have to produce high-capacity
solid-state drives to handle a massive volume of data. Solid-state drives are replacing
conventional hard disk storage systems. In the near future, more powerful processors will be
needed to process an enormous amount of multimedia big data. The diversity of real-time
multimedia big data, such as 3D graphics, audio, and video, calls for more efficient Central
Processing Unit (CPU) virtualization and I/O virtualization, which reduce cloud computing cost.
As already mentioned, IoT is the primary source of multimedia data on the internet, which would
need large data warehouses. Hadoop and Spark techniques would be used further to exploit data
locality, in preference to traditional High-Performance Computing (HPC), which transfers huge
volumes of data to the computing units.
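
Data locality means running a task on a node that already stores the data block it needs, so
that less data has to travel over the network. The sketch below illustrates the idea with a
greedy assignment; it is not the scheduler of Hadoop or Spark, and the names `schedule`,
`block_locations`, and the node and block labels are hypothetical.

```python
def schedule(tasks, block_locations, nodes):
    """Assign each task to the least-loaded node holding its block, falling
    back to the least-loaded node overall when no replica is available.

    Returns {task: (node, is_local)}."""
    load = {node: 0 for node in nodes}
    plan = {}
    for task, block in tasks.items():
        local = [n for n in block_locations.get(block, []) if n in load]
        candidates = local or nodes
        chosen = min(candidates, key=lambda n: load[n])
        load[chosen] += 1
        plan[task] = (chosen, chosen in local)
    return plan

# Hypothetical cluster state: which nodes store a replica of which block.
nodes = ["n1", "n2", "n3"]
block_locations = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": []}
tasks = {"t1": "b1", "t2": "b2", "t3": "b3"}

plan = schedule(tasks, block_locations, nodes)
print(plan)
```

Tasks marked as not local would require a data transfer, which is the cost that
locality-aware scheduling tries to avoid.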

8.2 Data Security and Privacy

Data security and privacy are a significant concern in multimedia big data. Still, most
enterprises do not use the cloud to store their multimedia big data because of the lack of data
visibility and privacy in this new infrastructure. As mentioned earlier, data security and
privacy have been a significant problem in the development of technologies and mobile devices
(MIoT). Secure storage and management of big data, privacy in data mining and analysis, and
access control are the possible mechanisms for multimedia big data. A security mechanism should
not increase the computational and processing cost, and it should strike a balance between
access control and ease of processing. An efficient security mechanism such as encryption of
multimedia data is required to secure multimedia big data. Privacy is a significant issue in
data mining. The privacy of data is achieved by encryption, anonymity, and temporary
identification. Another security issue of multimedia big data associated with IoT is the
heterogeneity of the sources used and of the types of data generated. Heterogeneous devices
could be authenticated by assigning a unique identification to each device. The heterogeneous
architecture of IoT has increased the security risks faced by security professionals.
Consequently, any attack on the IoT architecture compromises the security of the system and
cuts off interconnected devices. Traditional security algorithms are not appropriate for IoT
devices because the data are observed dynamically. The security problems encountered by IoT big
data are as follows: (a) dynamic updates: it is challenging to keep systems updated, (b)
identifying illegitimate traffic patterns among legitimate ones, (c) interoperability, and
(d) protocol convergence: IPv4 security rules are not directly suitable for IPv6.
Currently, these challenges have not been addressed to accomplish the privacy and security of
connected IoT devices. The following strategies can overcome these difficulties: (1) APIs are
essential to avoid compatibility and dependability problems. (2) IoT devices should be well
guarded while interconnecting with peers. (3) Devices should be protected with the best
hardcoded security practices to resist threats.
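
Two of the mechanisms mentioned above, authentication through a unique device identification
and privacy through temporary identification, can be sketched with keyed hashes from the Python
standard library. This is an illustrative assumption about how such mechanisms could look, not
a scheme described by the authors; the registry, the device names, the keys, and the one-hour
rotation period are hypothetical, and real deployments need proper key management.

```python
import hashlib
import hmac

def sign(device_key: bytes, device_id: str, payload: bytes) -> str:
    """Compute an HMAC-SHA256 tag binding the payload to one device."""
    message = device_id.encode("utf-8") + b"|" + payload
    return hmac.new(device_key, message, hashlib.sha256).hexdigest()

def verify(registry: dict, device_id: str, payload: bytes, tag: str) -> bool:
    """Accept the payload only if the device is registered and the tag matches."""
    key = registry.get(device_id)
    if key is None:
        return False
    return hmac.compare_digest(sign(key, device_id, payload), tag)

def temporary_id(secret: bytes, device_id: str, epoch: int) -> str:
    """Derive a pseudonym that changes whenever the epoch number changes."""
    message = f"{device_id}|{epoch}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]

# Hypothetical registry mapping unique device identifiers to shared keys.
registry = {"cam-01": b"example-key-not-for-production"}
tag = sign(registry["cam-01"], "cam-01", b"frame-0001")
print(verify(registry, "cam-01", b"frame-0001", tag))

# Pseudonyms for two consecutive one-hour epochs (timestamp // 3600).
alias_a = temporary_id(b"server-secret", "cam-01", 1_700_000_000 // 3600)
alias_b = temporary_id(b"server-secret", "cam-01", 1_700_003_600 // 3600)
print(alias_a != alias_b)
```

Using `hmac.compare_digest` avoids leaking information through comparison timing, and rotating
the pseudonym limits how long observations of one device can be linked.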

8.3 Data Mining

Data mining plays a significant part in multimedia big data; it is used to extract interesting
data from multimedia datasets. Multimedia datasets consist of structured and unstructured data
such as video, audio, images, speech, and text. Multimedia data mining is further categorized
into static and dynamic media. Examples of static media are text and images; examples of dynamic
media are audio and video. Multimedia data mining refers to the analysis of a massive amount of
multimedia big data gathered from IoT devices to extract useful information patterns based on
their statistical relationships.
Data mining methods offer solutions for multimedia big data that generalize to new data. IoT
brought challenges for data extraction. The main challenges related to data mining and
processing are knowledge discovery, processing, and data. Big data faces challenges due to
volume, openness, exactness, and heterogeneity of data sources and data types. Big data sets
contain more irregularities and uncertainties by nature, which require additional preprocessing
such as cleansing, reduction, and transmission. Recently, researchers have presented programming
models based on concurrent and serial processing, and diverse algorithms have been proposed to
reduce the response time of queries on big data. Researchers have an opportunity to address the
bottlenecks of data mining in big data IoT.
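
The cleansing and reduction steps of preprocessing can be illustrated with a short sketch. The
record layout (sensor, timestamp in seconds, value), the 60-second bucket size, and the function
names `cleanse` and `reduce_by_bucket` are assumptions made for this example only.

```python
import math
from collections import defaultdict

def cleanse(records):
    """Drop records with missing or non-numeric values and repeated
    (sensor, timestamp) pairs, keeping the first occurrence."""
    seen = set()
    clean = []
    for sensor_id, timestamp, value in records:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if math.isnan(value):
            continue
        if (sensor_id, timestamp) in seen:
            continue
        seen.add((sensor_id, timestamp))
        clean.append((sensor_id, timestamp, float(value)))
    return clean

def reduce_by_bucket(records, bucket_seconds=60):
    """Reduce the data volume by averaging values per sensor and time bucket."""
    sums = defaultdict(lambda: [0.0, 0])
    for sensor_id, timestamp, value in records:
        bucket = timestamp - timestamp % bucket_seconds
        acc = sums[(sensor_id, bucket)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical raw sensor records (illustrative values only).
raw = [
    ("t1", 0, 20.0),
    ("t1", 30, 22.0),
    ("t1", 30, 22.0),   # duplicate reading
    ("t1", 70, None),   # missing value
    ("t1", 90, 24.0),
]
summary = reduce_by_bucket(cleanse(raw))
print(summary)
```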

8.4 Visualization

In big data analytics with IoT systems, visualization plays a vital role in dealing with the
large amount of data generated. Visualization is difficult to process because of the large
amount of data and its many dimensions. Big data analytics and visualization need to work
together seamlessly to obtain better results from IoT applications. Visualization is a difficult
task in big data because of heterogeneous and varied types of data such as structured,
unstructured, and semi-structured data. Designing visualization for IoT big data is an arduous
task. Wang et al. [42] addressed the challenges and technological progress of visualization and
big data. Visualization software is used to visualize fine-grained dimensions based on the
concept of locality of reference, which raises the probability of identifying correlations,
outliers, and patterns. Concurrency is a challenging task in visualization for managing IoT big
data. Most of the visualization tools used for IoT perform poorly regarding scalability,
functionality, and response time. Gorodov et al. [43] addressed issues of data visualization in
the application to big data and real-time analytics for IoT architectures, such as visual noise,
information loss, large image perception, and high-performance requirements due to the dynamic
generation of data in an IoT environment.
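
One common way to reduce visual noise without hiding outliers is to keep only the minimum and
maximum of each segment of a long series before plotting. The sketch below shows this idea; it
is a generic technique offered as an illustration, not the method of the cited works, and the
function name `minmax_downsample` and the sample series are assumptions.

```python
def minmax_downsample(values, buckets):
    """Split `values` into `buckets` consecutive chunks and keep the minimum
    and maximum of each chunk in their original order."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(values)
    if n <= 2 * buckets:
        return list(values)  # already small enough to draw directly
    reduced = []
    for b in range(buckets):
        chunk = values[b * n // buckets:(b + 1) * n // buckets]
        lo_index = min(range(len(chunk)), key=chunk.__getitem__)
        hi_index = max(range(len(chunk)), key=chunk.__getitem__)
        # A set avoids emitting the same point twice in a constant chunk.
        for i in sorted({lo_index, hi_index}):
            reduced.append(chunk[i])
    return reduced

# Hypothetical series: a flat signal with a single spike.
series = [0] * 50 + [100] + [0] * 49
reduced = minmax_downsample(series, buckets=10)
print(len(series), len(reduced), max(reduced))
```

Plain averaging would flatten the spike, whereas keeping the extremes of each chunk preserves
it, which limits the information loss mentioned above.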

8.5 Cloud Computing

The advancement of virtualization technologies has motivated the development of cloud computing
technologies. Cloud computing technologies are characterized by virtual computers constructed on
top of the computing infrastructure. The main limitations of cloud computing are the cost of
storing massive amounts of data, control over the distributed environment, security, privacy,
and data transfer. All these limitations should be considered in future research directions.

8.6 Integration

Data integration means that data in different formats can be viewed in a uniform way.
Integration offers a single point of view of the data gathered from heterogeneous sources.
Multimedia big data are generated continuously from different sources. The produced data can be
classified into three groups, namely, (1) structured, (2) unstructured, and (3) semi-structured.
Integration extracts information from different datasets. It is challenging to integrate
different data types and to handle overlapping copies of the same data while increasing
scalability and performance and enabling real-time data access. These challenges related to the
integration of data should be addressed in the near future.
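
The idea of a uniform view over structured, semi-structured, and unstructured sources can be
sketched as follows. The three input formats, the field names `device` and `temp_c`, the log
sentence pattern, and the rule that the first record per device wins are all assumptions made
for this illustration.

```python
import csv
import io
import json
import re

def from_structured(csv_text):
    """Structured source: CSV with 'device' and 'temp_c' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"device": row["device"], "temp_c": float(row["temp_c"])} for row in reader]

def from_semi_structured(json_text):
    """Semi-structured source: JSON objects with a nested reading."""
    return [
        {"device": item["id"], "temp_c": float(item["reading"]["temperature"])}
        for item in json.loads(json_text)
    ]

def from_unstructured(text):
    """Unstructured source: free text such as 'device X reported 18.2 C'."""
    pattern = re.compile(r"device (\S+) reported (-?\d+(?:\.\d+)?) ?C")
    return [{"device": d, "temp_c": float(t)} for d, t in pattern.findall(text)]

def integrate(*sources):
    """Merge all sources into one list, keeping the first record per device
    so that overlapping copies of the same data are not repeated."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["device"], record)
    return list(merged.values())

# Hypothetical inputs (illustrative values only).
csv_text = "device,temp_c\ncam-1,21.5\ncam-2,19.0\n"
json_text = ('[{"id": "cam-2", "reading": {"temperature": 19.0}},'
             ' {"id": "cam-3", "reading": {"temperature": 23.1}}]')
log_text = "device cam-4 reported 18.2 C at the gate"

unified = integrate(
    from_structured(csv_text),
    from_semi_structured(json_text),
    from_unstructured(log_text),
)
print(unified)
```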

8.7 Multimedia Deep Learning

The application of deep learning in computer vision, NLP, speech processing, etc., is growing
faster than current research in deep learning on multimedia big data analysis. The deep learning
analysis of multimedia big data is still in its initial stage of development. The different
modalities of data need to be analyzed using multimodal deep learning techniques. Future deep
learning research is mainly focused on dealing with heterogeneous sources and with
high-dimensional and unlabeled multimedia data. Compared with traditional machine learning
approaches, the computational efficiency of deep learning remains a big challenge because a
massive amount of resources and more training time are needed. The efficiency of deep learning
techniques can be increased by using clusters of GPUs. Lacey et al. [37] suggested the use of
Field-Programmable Gate Arrays (FPGAs) for deep learning, which provide optimization and a large
degree of parallelism and reduce power consumption as compared to GPUs.
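
A simple way to combine several modalities is late fusion, in which each modality is scored by
its own model and the scores are then merged. The sketch below shows only the fusion step with
a weighted average; the per-modality scores are hypothetical stand-ins for the outputs of
trained networks, and the function name `late_fusion` is an assumption for this example.

```python
def late_fusion(scores_by_modality, weights=None):
    """Combine per-modality class scores by a weighted average.

    `scores_by_modality` maps a modality name to {label: score}; every
    modality is expected to score the same set of labels."""
    modalities = list(scores_by_modality)
    if not modalities:
        raise ValueError("at least one modality is required")
    if weights is None:
        weights = {m: 1.0 for m in modalities}
    total = sum(weights[m] for m in modalities)
    labels = list(scores_by_modality[modalities[0]])
    return {
        label: sum(weights[m] * scores_by_modality[m][label] for m in modalities) / total
        for label in labels
    }

# Hypothetical outputs of separate video and audio models (illustrative values).
scores = {
    "video": {"fire": 0.7, "normal": 0.3},
    "audio": {"fire": 0.4, "normal": 0.6},
}
fused = late_fusion(scores)
prediction = max(fused, key=fused.get)
print(fused, prediction)
```

Other multimodal approaches learn a joint representation instead of merging scores; late fusion
is shown here only because it is the easiest to express without a deep learning framework.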

9 Summary

This chapter has presented the sophisticated approaches developed for multimedia big data
analytics. The emergence of multimedia big data opens opportunities and draws more attention
from researchers. First, the chapter introduced the general background of big data, its
challenges, and its application in multimedia big data. The chapter provides an extensive
overview of the multimedia big data challenges and the impact of multimedia big data in IoT,
and its characteristics have been discussed from the perspective of the 10 V's. The different
phases of multimedia big data, such as data generation and acquisition, data representation,
compression, processing and analysis, storage and retrieval, assessment, and computing, have
been discussed. Further, a comprehensive and organized framework has been discussed for each
stage, covering the background, technical challenges, and a review of recent updates in the area
of multimedia big data. In addition, a list of opportunities for multimedia big data has also
been provided. In spite of all the challenges faced by multimedia big data, it still offers huge
opportunities to the Internet of Multimedia Things (IoMT) to advance services and applications
through the efficient use of multimedia big data. Many organizations have acknowledged the
development of multimedia big data for IoT applications. Multimedia big data for IoT is still in
its early stage. These discussions aim to offer a broad overview and perspective to enable
significant advances in multimedia big data for IoT that meet future requirements.

References

1. R. John et al., Riding the multimedia big data wave, in Proceedings of the 36th International
ACM SIGIR Conference on Research and Development in Information Retrieval (ACM, Dublin,
Ireland) pp. 1–2 (2013)
2. L. Mearian, By 2020, there will be 5,200 GB of data for every person on the
Earth, http://www.computerworld.com/article/2493701/data-center/by-2020-there-will-be-5-
200-gb-of-datafor-every-person-on-earth.html. Accessed 5 Apr 2016
3. E. Adler. Social media engagement: the surprising facts about how much time people spend on
the major social networks (2016), http://www.businessinsider.com/social-media-engagement-
statistics-2013-12
4. M.-L. Shyu, S.-C. Chen, Q. Sun, H. Yu, Overview and future trends of
multimedia research for content access and distribution. Int. J. Semant. Comput. 1(1), 29–66
(2007)
5. J. Gantz, D. Reinsel, Extracting value from chaos. IDC iView 1–12 (2011)
6. K. Cukier, Data, data everywhere: a special report on managing information (2011)
7. S. Lohr, The age of big data. N. Y. Times 11 (2012)
8. V. Mayer-Schonberger, K. Cukier, Big data: a revolution that will transform how we live, work,
and think. EamonDolan/Houghton Mifflin Harcourt (2013)
9. P. Zikopoulos, C. Eaton et al., Understanding big data: analytics for enterprise-class Hadoop
and streaming data. McGraw-Hill Osborne Media (2011)
10. E. Meijer, The world according to LINQ. Commun. ACM. 54(10), 45–51 (2011)
11. C. Hu, Z. Xu, Y. Liu, L. Mei, L. Chen, X. Luo, Semantic link network-based model for
organizing multimedia big data. IEEE Trans. Emerg. Top. Comput. 2(3), 376–387 (2014)
12. M. Beyer, Gartner says solving big data challenge involves more than just managing volumes
of data. Gartner, http://www.gartner.com/it/page.jsp
13. R. Cattell, Scalable SQL and NoSQL data stores. ACM SIGMOD Rec. 39(4), 12–27 (2011)
14. O.R. Team, Big data now: current perspectives from O'Reilly radar. O'Reilly Media (2011)
15. A. Labrinidis, H.V. Jagadish, Challenges and opportunities with big data. Proc. VLDB Endow.
5(12), 2032–2033 (2012)
16. R.E. Wilson, S.D. Gosling, L.T. Graham, A review of Facebook research in the social sciences.
Perspect. Psychol. Sci. 7(3), 203–220 (2012)
17. Z. Tufekci, Big questions for social media big data: representativeness, validity and other
methodological pitfalls, in Proceedings of the Eighth International Conference on Weblogs
and Social Media (Michigan, USA, 2014) pp. 505–514
18. D. Agrawal, P. Bernstein, E. Bertino, S. Davidson, U. Dayal, M. Franklin, J. Gehrke, L. Haas,
A. Halevy, J. Han et al., Challenges and opportunities with big data. A community white paper
developed by leading researchers across the United States (2012)
19. N.D. Lane, E. Miluzzo, L. Hong, D. Peebles, T. Choudhury, A.T. Campbell, A survey of mobile
phone sensing. IEEE Commun. Mag. 48(9), 140–150 (2010)
20. Xu Zheng, Yunhuai Liu, Lin Mei, Hu Chuanping, Lan Chen, Semantic-based representing and
organizing surveillance big data using video structural description technology. J. Syst. Softw.
102, 217–225 (2015)
21. Mei-Ling Shyu, Zongxing Xie, Min Chen, Shu-Ching Chen, Video semantic event/concept
detection using a subspace-based multimedia data mining framework. IEEE Trans. Multimedia
10(2), 252–259 (2008)
22. A. Kumari, S. Tanwar, S. Tyagi, N. Kumar, Fog computing for healthcare 4.0 environment:
opportunities and challenges, Comput. Electr. Eng. 72, 1–13 (2018)
23. Gerald Schuller, Matthias Gruhne, Tanja Friedric, Fast audio feature extraction from com-
pressed audio data. IEEE J. Select. Top. Sign. Process. 5(6), 1262–1271 (2011)
24. K.R. Malik, T. Ahmad, M. Farhan, M. Aslam, S. Jabbar, S. Khalid, M. Kim, Big-data: trans-
formation from heterogeneous data to semantically-enriched simplified data. Multimed. Tools
Appl. 75(20), 12727–12747 (2016)
25. Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama,
T. Darrell, Caffe: convolutional architecture for fast feature embedding, in Proceedings of the
ACM International Conference on Multimedia (2014), pp. 675–678
26. N. Khan, I. Yaqoob, I.A.T. Hashem, Z. Inayat, W. Kamaleldin, M. Ali, M. Alam, M. Shiraz,
A. Gani, Big data: survey, technologies, opportunities, and challenges. Sci. World J. 18 (2014)
27. David Mera, Michal Batko, Pavel Zezula, Speeding up the multimedia feature extraction: a
comparative study on the big data approach. Multimed. Tools Appl. 76(5), 7497–7517 (2017)
28. G. Zhang, Y. Yang, X. Zhai, W. Huang, J. Wang, Public cultural big data analysis platform, in
Proceedings of 2016 IEEE Second International Conference on Multimedia Big Data (BigMM)
(Taipei, Taiwan, 2016) pp. 398–403
29. S. Hendrickson, Getting started with Hadoop with Amazon’s Elastic MapReduce
(2010), https://www.slideshare.net/DrSkippy27/amazon-elastic-map-reduce-getting-started-
with-hadoop
30. X. Wu, H. Chen, G. Wu, J. Liu, Q. Zheng, X. He, A. Zhou, Z. Zhao, B. Wei, M. Gao, Y. Li, Q.
Zhang, S. Zhang, R. Lu, N. Zhang, Knowledge engineering with big data. IEEE Intell. Syst.
30(5), 46–55 (2015)
31. M. Schuhmacher, S.P. Ponzetto, Knowledge-based graph document modeling, in Proceedings
of 7th ACM International Conference on Web Search and Data Mining (WSDM’14) (New
York, NY, 2014) pp. 543–552
32. L.-Y. Duan, J. Lin, J. Chen, T. Huang, W. Gao, Compact descriptors for visual search. IEEE
Multimed. 21(3), 30–40 (2014)
33. J. Herrera, G. Molto, Detecting events in streaming multimedia with big data techniques, in
Proceedings of 2016 24th Euromicro International Conference on Parallel, Distributed, and
Network-Based Processing (PDP), (Heraklion Crete, Greece, 2016), pp. 345–349
34. E. Dede, B. Sendir, P. Kuzlu, J. Weachock, M. Govindaraju, L. Ramakrishnan, Processing
Cassandra datasets with Hadoop-Streaming based approaches. IEEE Trans. Serv. Comput.
9(1), 46–58 (2016)
35. Z. Wang, S. Mao, L. Yang, P. Tang, A survey of multimedia big data. China Commun. 15(1),
155–176 (2018)
36. B. Sadiq, F. Ur Rehman, A. Ahmad, A Spatio-temporal multimedia big data framework for a
large crowd, in Proceedings of 2015 IEEE International Conference on Big Data (Santa Clara,
CA, 2015), pp. 2742–2751
37. G. Lacey, G.W. Taylor, S. Areibi, Deep learning on FPGAs: past, present, and future. CoRR
abs/1602.04283 (2016). http://arxiv.org/abs/1602.04283
38. B. Garcia, M. Gallego, L. Lopez, G.A. Carella, A. Cheambe, NUBOMEDIA: an elastic PaaS
enabling the convergence of real-time and big data multimedia, in Proceedings of 2016 IEEE
International Conference on Smart Cloud (SmartCloud) (New York, 2016) pp. 45–56
39. X. Wang, L. Gao, S. Mao, BiLoc: bi-modality deep learning for indoor localization with 5 GHz
commodity Wi-Fi. IEEE Access J. 5(1), 4209–4220 (2017)
40. Tanwar et al., An advanced internet of thing based security alert system for smart home, in
International Conference on Computer, Information and Telecommunication Systems (IEEE
CITS-2017), vol. 21(2) (Dalian University, Dalian, China, 2017), pp. 25–29
41. S. Tanwar, S. Tyagi, S. Kumar, The role of internet of things and smart grid for the development
of a smart city, in Intelligent Communication and Computational Technologies, IoT4TD 2017
(Lecture Notes in Networks and Systems: Proceedings of Internet of Things for Technological
Development), vol. 19 (Springer International Publishing, 2017), pp. 23–33
42. L. Lin, G. Ravitz, M.-L. Shyu, S.-C. Chen, Effective feature space reduction with imbalanced
data for semantic concept detection, in Proceedings of the IEEE International Conference on
Sensor Networks, Ubiquitous, and Trustworthy Computing 262–269 (2008)
43. E.Y. Gorodov, V.V. Gubarev, Analytical review of data visualization methods in application to
big data. J. Electr. Comput. Eng. 22 (2013)
44. S. Pouyanfar, Y. Yang, S.-C. Chen, M.-L. Shyu, S.S. Iyengar, Multimedia big data analytics: a
survey. ACM Comput. Surv. 51(1):10:1–10:34 (2018)
45. A. Madan, M. Cebrian, D. Lazer, and A. Pentland. Social sensing for epidemiological behavior
change, in Proceedings of the 12th ACM International Conference on Ubiquitous Computing
(Ubi Comp’10, 2010) pp. 291–300
46. L. Selavo, A. Wood, Q. Cao, T. Sookoor, H. Liu, A. Srinivasan, Y. Wu, W. Kang, J. Stankovic, D.
Young, and J. Porter. Luster: Wireless sensor network for environmental research in Proceed-
ings of the 5th International Conference on Embedded Networked Sensor Systems (SenSys’07,
2007) pp. 103–116
47. M.S. Hossain, G. Muhammad, Cloud-assisted industrial internet of things (IIoT)-enabled
framework for health monitoring. Comput. Netw. 101, 192–202 (2016)
