Big Data Seminar Report
BELGAVI, KARNATAKA
Submitted By:
SNEHA K S
(4MG18CS038)
CERTIFICATE
Certified that the seminar work entitled “BIG DATA ANALYTICS” has been
presented by SNEHA K S (4MG18CS038) for the partial fulfilment of the Eighth
Semester, B.E. degree in Computer Science & Engineering of Visvesvaraya
Technological University, Belagavi, during the year 2021-22. It is certified that all
corrections/suggestions indicated have been incorporated in the report. The seminar
report has been approved and certified as per the requirements.
GMIT, Bharathinagara
DECLARATION
The project entitled “BIG DATA ANALYTICS” was duly executed by me, SNEHA
K S (4MG18CS038), Eighth Semester, B.E. in Computer Science and Engineering, G
Madegowda Institute of Technology, Bharathinagara, under the guidance of Mr.
Pradeep B M, Associate Prof., Dept. of CS&E, G Madegowda Institute of
Technology, Bharathinagara, 2021-2022. I hereby declare that the above-entitled
project work was executed only by me, for the partial fulfilment of the requirement for
the award of the Bachelor degree in Computer Science and Engineering prescribed
by Visvesvaraya Technological University, “Jnana Sangama”, Belagavi 590014.
SNEHA K S
(4MG18CS038)
ACKNOWLEDGEMENT
I feel great pleasure in acknowledging the guidance and assistance of all
those people who have made my work on this report a pleasant endeavour.
I also thank the members of the faculty of the Department of Computer Science and
Engineering, GMIT, Bharathinagara, whose suggestions enabled me to surpass
many of the seemingly impossible hurdles. I also thank my guide, and lastly, I
thank everybody who has directly or indirectly helped me in the course of this work.
SNEHA K S(4MG18CS038)
ABSTRACT
Big data is a term for massive data sets having a large, varied and complex
structure, with attendant difficulties in storing, analysing and visualizing them for
further processing or results. The process of researching massive amounts of data to
reveal hidden patterns and secret correlations is named big data analytics. This
information is useful for companies and organizations, helping them gain richer and
deeper insights and an advantage over the competition. For this reason, big
data implementations need to be analysed and executed as accurately as possible.
This paper presents an overview of big data's content, scope, samples, methods,
advantages and challenges, and discusses privacy concerns around it.
CONTENTS
1. Introduction
2. Literature Survey
3. Concept of Topic
4. Advantages and Disadvantages of Big Data
5. Applications of Big Data
6. Conclusion
7. References
INTRODUCTION
Big data is a broad term for data sets so large or complex that traditional data
processing applications are inadequate. Challenges include analysis, capture, data
curation, search, sharing, storage, transfer, visualization, and information privacy.
The term often refers simply to the use of predictive analytics or certain other
advanced methods to extract value from data, and seldom to a particular size of
data set. Accuracy in big data may lead to more confident decision-making, and
better decisions can mean greater operational efficiency, cost reductions and
reduced risk.
Analysis of data sets can find new correlations to "spot business trends,
prevent diseases, combat crime and so on." Scientists, practitioners of media
and advertising and governments alike regularly meet difficulties with large
data sets in areas including Internet search, finance and business informatics.
Scientists encounter limitations in e-Science work, including meteorology,
genomics, connectomes, complex physics simulations, and biological and
environmental research.
Data sets grow in size in part because they are increasingly being gathered
by cheap and numerous information-sensing mobile devices, aerial (remote
sensing), software logs, cameras, microphones, radio-frequency identification
(RFID) readers, and wireless sensor networks. The world's technological per-capita
capacity to store information has roughly doubled every 40 months since
the 1980s; as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data were
created. The challenge for large enterprises is determining who should own big
data initiatives that straddle the entire organization.
Recent studies show that the use of a multiple-layer architecture is one option
for dealing with big data. A distributed parallel architecture distributes data
across multiple processing units, and parallel processing units deliver data much
faster, improving processing speeds. This type of architecture inserts data into
a parallel DBMS, which implements the use of MapReduce and Hadoop
frameworks. This type of framework looks to make the processing power
transparent to the end user by using a front-end application server.
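As a concrete illustration of the map/reduce pattern such frameworks implement, here is a minimal word-count sketch (the classic MapReduce example) in plain Python. The input splits and words are invented for illustration; in Hadoop the map calls would run in parallel across many nodes, whereas here they run sequentially for clarity.

```python
from collections import Counter
from functools import reduce

def map_phase(split):
    # Map: emit a partial count of each word in one input split.
    return Counter(split.split())

def reduce_phase(left, right):
    # Reduce: merge the partial counts produced by the mappers.
    return left + right

def word_count(splits):
    # Hadoop would schedule these map calls on many nodes in parallel.
    partials = [map_phase(s) for s in splits]
    return reduce(reduce_phase, partials, Counter())

splits = ["big data big insights", "data grows fast", "big clusters scale"]
print(word_count(splits)["big"])  # 3
```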
Big data usually includes data sets with sizes beyond the ability of commonly used
software tools to capture, curate, manage, and process data within a tolerable
elapsed time. Big data "size" is a constantly moving target; as of 2012 it ranged
from a few dozen terabytes to many petabytes of data. Big data is a set of
techniques and technologies that require new forms of integration to uncover large
hidden values from large datasets that are diverse, complex, and of a massive
scale.
In a 2001 research report and related lectures, META Group (now Gartner)
analyst Doug Laney defined data growth challenges and opportunities as being
three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data
in and out), and variety (range of data types and sources). Gartner, and now much
of the industry, continue to use this "3Vs" model for describing big data. In 2012,
Gartner updated its definition as follows: "Big data is high volume, high velocity,
and/or high variety information assets that require new forms of processing to
enable enhanced decision making, insight discovery and process optimization."
Additionally, some organizations have added a fourth V, "Veracity", to describe it.
While Gartner's definition (the 3Vs) is still widely used, the growing maturity of
the concept fosters a sounder distinction between big data and business
intelligence, regarding data and their use.
A more recent, consensus definition states that "Big Data represents the
Information assets characterized by such a High Volume, Velocity and Variety to
require specific Technology and Analytical Methods for its transformation into
Value".
ADVANTAGES AND DISADVANTAGES OF BIG
DATA
ADVANTAGES:
• Our newest research finds that organizations are using big data to target
customer-centric outcomes, tap into internal data and build a better
information ecosystem.
• Big Data is already an important part of the $64 billion database and data
analytics market. It offers commercial opportunities of a comparable scale to
enterprise software in the late 1980s, the Internet boom of the 1990s, and the
social media explosion of today.
DISADVANTAGES:
• Organizations risk being overwhelmed by the sheer volume of data
• Self-regulation
• Legal regulation
APPLICATIONS OF BIG DATA
Big data has increased the demand for information management specialists, so
much so that Software AG, Oracle Corporation, IBM, Microsoft, SAP, EMC, HP and Dell have
spent more than $15 billion on software firms specializing in data management
and analytics. In 2010, this industry was worth more than $100 billion and was
growing at almost 10 percent a year: about twice as fast as the software business as
a whole.
Government
The use and adoption of Big Data within governmental processes is beneficial and
allows efficiencies in terms of cost, productivity, and innovation. That said, this
process does not come without its flaws. Data analysis often requires multiple
parts of government (central and local) to work in collaboration and create new
and innovative processes to deliver the desired outcome.
Below are thought-leading examples within the governmental Big Data space.
• In 2012, the Obama administration announced the Big Data Research and
Development Initiative, to explore how big data could be used to address
important problems faced by the government. The initiative is composed of
84 different big data programs spread across six departments.
• Big data analysis played a large role in Barack Obama's successful
2012 re-election campaign.
• The United States Federal Government owns six of the ten
most powerful supercomputers in the world.
• The Utah Data Center is a data center being constructed by the
United States National Security Agency.
India
• Big data analysis was, in part, responsible for the BJP and its allies
winning the 2014 Indian general election.
• The Indian Government utilises numerous techniques to ascertain how the
Indian electorate is responding to government action, as well as to gather
ideas for policy augmentation.
United Kingdom
Manufacturing
Based on a 2013 TCS Global Trend Study, improvements in supply planning and
product quality provide the greatest benefit of big data for manufacturing. Big data
provides an infrastructure for transparency in the manufacturing industry, that is,
the ability to unravel uncertainties such as inconsistent component performance and
availability. Predictive manufacturing, as an applicable approach toward near-zero
downtime and transparency, requires vast amounts of data and advanced prediction
tools for the systematic processing of data into useful information.
Cyber-Physical Models
Current PHM implementations mostly utilize data during the actual usage
while analytical algorithms can perform more accurately when more
information throughout the machine’s lifecycle, such as system
configuration, physical knowledge and working principles, are
included. There is a need to systematically integrate, manage and analyze
machinery or process data during different stages of machine life cycle to handle
data/information more efficiently and further achieve better transparency of
machine health condition for manufacturing industry.
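To make the idea concrete, the following is a minimal, hypothetical sketch (not any specific PHM system): it flags readings from a single machine sensor that drift well outside a recent baseline, the kind of simple health indicator a PHM pipeline might derive from usage data. The sensor values, window size and threshold are invented for illustration.

```python
import statistics

def flag_anomalies(readings, window=5, k=3.0):
    """Flag readings more than k standard deviations away from the
    mean of the preceding `window` values (a toy health indicator)."""
    flags = []
    for i, value in enumerate(readings):
        baseline = readings[max(0, i - window):i]
        if len(baseline) < 2:
            flags.append(False)  # not enough history to judge yet
            continue
        mean = statistics.mean(baseline)
        sd = statistics.stdev(baseline)
        flags.append(sd > 0 and abs(value - mean) > k * sd)
    return flags

# Invented vibration readings with one obvious spike:
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.1, 5.0, 1.0]
flags = flag_anomalies(vibration)
print(flags.index(True))  # the spike at index 6 is the only flagged reading
```

A real PHM pipeline would fold in lifecycle information (configuration, physical models) rather than a raw threshold, which is exactly the gap the paragraph above describes.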
Media
To understand how the media utilises Big Data, it is first necessary to provide
some context into the mechanism used for media process. It has been suggested by
Nick Couldry and Joseph Turow that practitioners in Media and Advertising
approach big data as many actionable points of information about millions of
individuals. The industry appears to be moving away from the traditional
approach of using specific media environments such as newspapers, magazines, or
television shows and instead tap into consumers with technologies that reach
targeted people at optimal times in optimal locations. The ultimate aim is to serve,
or convey, a message or content that is (statistically speaking) in line with the
consumers mindset. For example, publishing environments are increasingly
tailoring messages (advertisements) and content (articles) to appeal to consumers
that have been exclusively gleaned through various data-mining activities.
Big Data and the IoT work in conjunction. From a media perspective, data is the
key derivative of device interconnectivity and allows accurate targeting. The
Internet of Things, with the help of big data, therefore transforms the media
industry, companies and even governments, opening up a new era of economic
growth and competitiveness. The intersection of people, data and intelligent
algorithms has far-reaching impacts on media efficiency. The wealth of data
generated adds an elaborate layer to the present targeting mechanisms of the
industry.
Technology
• eBay.com uses two data warehouses, at 7.5 petabytes and 40 PB, as well
as a 40 PB Hadoop cluster for search, consumer recommendations, and
merchandising.
• Amazon.com handles millions of back-end operations every day, as
well as queries from more than half a million third-party sellers. The
core technology that keeps Amazon running is Linux-based, and as of
2005 they had the world's three largest Linux databases, with capacities
of 7.8 TB, 18.5 TB, and 24.7 TB.
• Facebook handles 50 billion photos from its user base.
• As of August 2012, Google was handling roughly 100 billion searches
per month.
Private sector
Retail
Retail Banking
Real Estate
• Windermere Real Estate uses anonymous GPS signals from nearly 100
million drivers to help new home buyers determine their typical drive
times to and from work throughout various times of the day.
Science
The Large Hadron Collider experiments represent about 150 million sensors
delivering data 40 million times per second. There are nearly 600 million
collisions per second. After filtering and refraining from recording more than
99.99995% of these streams, there are 100 collisions of interest per second.
• As a result, working with less than 0.001% of the sensor stream
data, the data flow from all four LHC experiments represents a 25-petabyte
annual rate before replication (as of 2012). This becomes
nearly 200 petabytes after replication.
• If all sensor data were to be recorded at the LHC, the data flow would be
extremely hard to work with. It would exceed a 150 million
petabyte annual rate, or nearly 500 exabytes per day, before
replication. To put the number in perspective, this is equivalent to 500
quintillion (5×10^20) bytes per day, almost 200 times more than all the
other sources combined in the world.
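As a back-of-the-envelope check, the filtering fraction and the daily rate can be reproduced in a few lines of Python using only the figures quoted above:

```python
# Sanity check of the LHC figures quoted above (2012-era numbers).
collisions_per_second = 600e6   # quoted collision rate
kept_per_second = 100           # collisions of interest after filtering

fraction_rejected = 1 - kept_per_second / collisions_per_second
print(f"rejected: {fraction_rejected:.7%}")  # more than 99.99995%

# Hypothetical unfiltered flow, from the quoted annual figure:
petabytes_per_year = 150e6
exabytes_per_day = petabytes_per_year / 365 / 1000
print(f"{exabytes_per_day:.0f} EB/day")  # ~411 EB/day, the order of magnitude quoted
```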
CONCLUSION
The Age of Big Data is here, and these are truly revolutionary
times, if both business and technology professionals continue to work
together and deliver on the promise.
REFERENCES
1. Adams, M.N.: Perspectives on Data Mining. International Journal of
Market Research 52(1), 11–19 (2010).
2. Asur, S., Huberman, B.A.: Predicting the Future with Social Media. In:
ACM International Conference on Web Intelligence and Intelligent Agent
Technology, vol. 1, pp. 492–499 (2010).
3. Bakshi, K.: Considerations for Big Data: Architecture and Approaches. In:
Proceedings of the IEEE Aerospace Conference, pp. 1–7 (2012).
4. Cebr: Data Equity: Unlocking the Value of Big Data. In: SAS Reports, pp.
1–44 (2012).
5. Cohen, J., Dolan, B., Dunlap, M., Hellerstein, J.M., Welton, C.: MAD
Skills: New Analysis Practices for Big Data. Proceedings of the VLDB
Endowment 2(2), 1481–1492 (2009).