MODULE – 1
INTRODUCTION TO INFORMATION STORAGE
EMC Proven Professional. Copyright © 2012 EMC Corporation. All Rights Reserved. 1
CONTENTS
▪ Information Storage
▪ Evolution of Storage Architecture
▪ Data Center Infrastructure
▪ Virtualization and Cloud Computing
Why Information Storage and Management?
• Information is the knowledge derived from data
• Growth of digital information has resulted in an information explosion
• We live in an on-command, on-demand world
  ▪ We need information when and where required
• Increasing dependency on fast and reliable access to information
• Businesses seek to store, protect, optimize, and leverage information
  ▪ To gain competitive advantage
  ▪ To derive new business opportunities
Fig: Virtuous cycle of information
What is Data?
Data
It is a collection of raw facts from which conclusions may be drawn.
• Data is converted into a more convenient form: digital data
• Factors for digital data growth are:
  ▪ Increase in data-processing capabilities
  ▪ Lower cost of digital storage
  ▪ Affordable and faster communication technology
  ▪ Proliferation of applications and smart devices
Fig: Digital data (movie to digital movie, photo to digital photo, book to e-book, letter to email)
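As a rough illustration of what "digital data" means, the sketch below turns a short piece of text into the kind of bit string shown in the figure and back again. The function names `to_bits` and `from_bits` are hypothetical, and UTF-8 is an assumed encoding; the slides do not prescribe one.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Inverse of to_bits: rebuild the text from space-separated 8-bit groups."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


letter = "Hi"
encoded = to_bits(letter)
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

The same idea applies to photos, movies, and books: each is stored as a sequence of bits, only with a different encoding.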
Types of Data
• Data can be classified as:
  ▪ Structured
  ▪ Unstructured
• Majority of data being created is unstructured
Fig: Types of data. Unstructured (90%): email attachments, PDFs, X-rays, manuals, instant messages, images, documents, forms, web pages, contracts, rich media, invoices, audio, video. Structured (10%): databases.
Big Data
Big Data
It refers to data sets whose sizes are beyond the ability of commonly used
software tools to capture, store, manage, and process within acceptable
time limits.
• Includes both structured and unstructured data generated by a variety of sources
• Big data analysis in real time requires new techniques and tools that provide:
  ▪ High performance
  ▪ Massively parallel processing (MPP) data platforms
  ▪ Advanced analytics
• Big data analytics provides an opportunity to translate large volumes of data into the right decisions
Storage
• Stores data created by individuals and organizations
  ▪ Provides access to data for further processing
• Examples of storage devices are:
  ▪ Media card in a cell phone or digital camera
  ▪ DVDs, CD-ROMs
  ▪ Disk drives
  ▪ Disk arrays
  ▪ Tapes
Evolution of Storage Architecture
Fig: Server-centric storage architecture (Department 1, 2, and 3 servers) and information-centric storage architecture (Department 1, 2, and 3 servers, storage network, storage device)
Data Center
Data Center
It is a facility that contains storage, compute, network, and other IT
resources to provide centralized data-processing capabilities.
• Core elements of a data center
  ▪ Application
  ▪ Database management system (DBMS)
  ▪ Host or Compute
  ▪ Network
  ▪ Storage
• These core elements work together to address data-processing requirements
Data Center: Online Order Transaction System Example
Fig: Client (user interface), LAN/WAN, host/compute (OS and DBMS), storage network, storage array
Key Characteristics of a Data Center
• Availability
• Security
• Scalability
• Performance
• Data Integrity
• Capacity
• Manageability
Managing Data Center
• Key management activities include:
  ▪ Monitoring
    ▪ Continuous process of gathering information on various elements and services running in a data center
  ▪ Reporting
    ▪ Details on resource performance, capacity, and utilization
  ▪ Provisioning
    ▪ Configuration and allocation of resources to meet the capacity, availability, performance, and security requirements
• Virtualization and cloud computing have changed the way data center infrastructure resources are provisioned and managed
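The three management activities can be pictured with a small sketch. It assumes a hypothetical `Resource` record holding monitored capacity figures; `utilization_report` stands in for reporting and `provision` for provisioning. The names, the numbers, and the "least-utilized resource" placement rule are illustrative assumptions, not part of the course material.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    capacity_gb: int   # total capacity
    used_gb: int       # capacity already allocated


def utilization_report(resources):
    """Reporting: percentage utilization per resource, from monitored values."""
    return {r.name: round(100 * r.used_gb / r.capacity_gb, 1) for r in resources}


def provision(resources, size_gb):
    """Provisioning: allocate size_gb on the least-utilized resource that fits."""
    candidates = [r for r in resources if r.capacity_gb - r.used_gb >= size_gb]
    if not candidates:
        raise ValueError("no resource has enough free capacity")
    target = min(candidates, key=lambda r: r.used_gb / r.capacity_gb)
    target.used_gb += size_gb
    return target.name


# Monitoring would normally gather these numbers continuously; here they are made up.
arrays = [Resource("array-a", 1000, 800), Resource("array-b", 2000, 500)]
print(utilization_report(arrays))   # {'array-a': 80.0, 'array-b': 25.0}
print(provision(arrays, 300))       # array-b
print(utilization_report(arrays))   # {'array-a': 80.0, 'array-b': 40.0}
```

A real data center would also weigh availability, performance, and security requirements when provisioning; this sketch considers capacity only.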
Virtualization: An Overview
• Virtualization is a technique of abstracting physical resources and making them appear as logical resources
  ▪ For example, partitioning of raw disks
• Pools physical resources and provides an aggregated view of physical resource capabilities
• Enables centralized management of pooled resources
• Virtual resources can be created from pooled physical resources
  ▪ Improves utilization of physical IT resources
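The pooling idea above can be sketched in a few lines. `StoragePool` is a hypothetical class, not a real product API: it aggregates several physical disks into one logical capacity and carves virtual volumes out of that pool.

```python
class StoragePool:
    """Hypothetical pool that aggregates physical disks into one logical capacity."""

    def __init__(self, disk_sizes_gb):
        self.total_gb = sum(disk_sizes_gb)   # aggregated view of physical capacity
        self.volumes = {}                    # virtual volume name -> size in GB

    @property
    def free_gb(self):
        return self.total_gb - sum(self.volumes.values())

    def create_volume(self, name, size_gb):
        """Carve a virtual volume out of the pool, independent of disk boundaries."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        self.volumes[name] = size_gb


pool = StoragePool([500, 500, 1000])   # three physical disks
pool.create_volume("vol-db", 1200)     # larger than any single disk
print(pool.total_gb, pool.free_gb)     # 2000 800
```

Because the volume is allocated from the pool rather than from one disk, it can exceed the size of any single physical device, which is one way pooling improves utilization.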
Cloud Computing: An Overview
• Enables individuals and organizations to use IT resources as a service over a network
• Enables self-service requesting and automates the request-fulfillment process
  ▪ Enables users to scale up or scale down the usage of computing resources quickly
• Enables consumption-based metering
  ▪ Consumers pay only for the resources they use
    ▪ Example: CPU hours used, amount of data transferred, and gigabytes of data stored
• Cloud infrastructure is built upon virtualized data centers
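Consumption-based metering amounts to multiplying each metered quantity by a unit rate and summing the results. The sketch below uses the three metrics named above; the rates in `RATES` and the function name `monthly_charge` are made-up assumptions for illustration, not any provider's actual pricing.

```python
from decimal import Decimal

# Hypothetical unit prices; real providers publish their own rates.
RATES = {
    "cpu_hours": Decimal("0.05"),        # per CPU hour
    "gb_transferred": Decimal("0.10"),   # per gigabyte transferred
    "gb_stored": Decimal("0.02"),        # per gigabyte stored per month
}


def monthly_charge(usage):
    """Return the charge for one consumer: sum of metered quantity x unit rate."""
    return sum(
        (RATES[item] * Decimal(str(qty)) for item, qty in usage.items()),
        Decimal("0"),
    )


usage = {"cpu_hours": 120, "gb_transferred": 40, "gb_stored": 500}
print(monthly_charge(usage))   # 20.00
```

A consumer with no usage in a month is charged nothing, which is the sense in which consumers pay only for the resources they use.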
Summary
Key points covered in this chapter:
• Data and information
• Types of data
• Big data
• Evolution of storage architecture
• Core elements of a data center
• Key characteristics of a data center
• Virtualization and cloud computing