IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No 3, July 2012
ISSN (Online): 1694-0814
www.IJCSI.org

Cloud Databases: A Paradigm Shift in Databases


Indu Arora¹ and Dr. Anu Gupta²

¹ Department of Computer Science and Application, MCM DAV College for Women, Chandigarh

² Department of Computer Science and Application, Panjab University, Chandigarh

Abstract

Relational databases ruled the Information Technology (IT) industry for almost 40 years. But the last few years have seen sea changes in the way IT is being used and viewed. Stand-alone applications have been replaced with web-based applications, dedicated servers with multiple distributed servers, and dedicated storage with network storage. Cloud computing has become a reality due to its lower cost, scalability and pay-as-you-go model. It is one of the biggest changes in IT since the rise of the World Wide Web. Cloud databases such as BigTable, Sherpa and SimpleDB are becoming popular. They address the limitations of existing relational databases related to scalability, ease of use and dynamic provisioning. Cloud databases are mainly used for data-intensive applications such as data warehousing, data mining and business intelligence. These applications are read-intensive, scalable and elastic in nature. Transactional data management applications such as banking, airline reservation, online e-commerce and supply chain management applications are write-intensive. Databases supporting such applications require ACID (Atomicity, Consistency, Isolation and Durability) properties, but these databases are difficult to deploy in the cloud. The goal of this paper is to review the state of the art in cloud databases and various architectures. It further assesses the challenges of developing cloud databases that meet user requirements and discusses popularly used cloud databases.

Keywords: Cloud computing; Cloud Databases; Database Architectures.

1. Introduction

The Information Technology (IT) department of any organization is responsible for providing reliable computing, storage, backup and network facilities at the lowest feasible cost. The huge investment required in IT infrastructure hinders its adoption, especially for small-scale organizations. Cash-strapped organizations look for alternatives which can reduce the capital investment involved in purchasing and maintaining IT hardware and software so that they can get the maximum benefit from IT. Cloud computing (CC) becomes a natural and ideal choice for such organizations and customers. Cloud computing takes advantage of many technologies such as server consolidation, huge and faster storage, grid computing, virtualization, N-tier architecture and robust networks. It delivers highly scalable and otherwise expensive infrastructure with minimal set-up and negligible maintenance cost. It provides IT-related services such as Software-as-a-Service, Development Platforms-as-a-Service and Infrastructure-as-a-Service over the network, on demand, anytime and from anywhere, on the basis of a “pay-as-you-go” model. It is a fast-growing concept that is changing the IT-related perceptions of its users. Elasticity, scalability, high availability, price-per-usage and multi-tenancy are the main features of Cloud computing. It reduces the cost of using expensive resources at the provider’s end due to economies of scale. Quick provisioning and immediate deployment of the latest applications at lower cost are the benefits that drive people to adopt Cloud computing.

Cloud computing has brought a paradigm shift not only in the technology landscape, but also in the database landscape. With greater usage of Cloud computing, demand for provisioning of database services has risen. Provisioning of Cloud databases is known as Database-as-a-Service in Cloud terminology. The main objective of the paper is to explore the trends in Cloud databases and analyze the potential challenges in developing these databases. The paper is divided into six sections. Section 2 describes Cloud databases. Section 3 provides an overview of common types of databases. Section 4 discusses major challenges in developing cloud databases. Section 5 summarizes existing cloud databases, followed by conclusions in Section 6.

2. Cloud Databases

Massive growth in digital data, changing data storage requirements, better broadband facilities and Cloud computing led to the emergence of cloud databases [1]. Cloud Storage, Data as a Service (DaaS) and Database as a Service (DBaaS) are the different terms used for data management in the Cloud. They differ on the basis of how data is stored and managed. Cloud storage is virtual storage that enables users to store documents and objects.

Copyright (c) 2012 International Journal of Computer Science Issues. All Rights Reserved.

Dropbox, iCloud etc. are popular cloud storage services [2]. DaaS allows users to store data on a remote disk available through the Internet. It is used mainly for backup purposes and basic data management. Cloud storage cannot work without basic data management services, so these two terms are used interchangeably. DBaaS goes one step further. It offers complete database functionality and allows users to access and store their database on remote disks anytime from any place through the Internet. Amazon’s SimpleDB, Amazon RDS, Google’s BigTable, Yahoo’s Sherpa and Microsoft’s SQL Azure Database are commonly used databases in the Cloud [3].

A cloud database is a database delivered to users on demand through the Internet from a cloud database provider's servers. Cloud databases provide scalability, high availability, optimized resource allocation and multi-tenancy. A cloud database can be a traditional database such as MySQL or SQL Server. These databases can be installed, configured and maintained on a Cloud server by the users themselves. This option is popularly called the “Do-it-Yourself” (DIY) approach. A few providers offer ready-made database services such as Xeround’s MySQL [4]. In the “Do-it-Yourself” approach, the developers manually ensure the reliability and elasticity of the service. Selecting a DBaaS solution reduces the complexity and cost of running one’s own database. It spares the developer the hassle of tedious database management tasks. Cloud databases provide improved availability, scalability, performance and flexibility at a lower price. A conventional DBMS (Database Management System) deals with structured data which is held in databases along with its metadata, whereas Cloud databases can be used for unstructured, semi-structured or structured data. Data stored in files of various types where the metadata is either unavailable or incomplete is called unstructured data. Cloud databases are able to support the changing storage requirements of Internet-savvy users who deal more with unstructured data and user-created content such as documents and photos. Shared-nothing and shared-disk are two widely used storage architectures in database systems.

2.1 Shared-nothing Storage Architecture

Shared-nothing storage architecture involves data partitioning, which splits the data into independent sets. These data sets are physically located on different database servers. Each server processes and maintains its piece of the database exclusively, which makes shared-nothing databases easily scalable. Due to this inherent scalability, applications designed to work on shared-nothing storage architecture are suitable for the Cloud. But the data partitioning used in this architecture does not work well with the cloud. It is very difficult to virtualize a shared-nothing database, as it becomes very complex and difficult to maintain due to data partitioning. It needs a piece of middleware to route database requests to the appropriate server. As more servers are added, data has to be repartitioned. Data partitioning should be done very carefully; otherwise data shipping (passing information from one machine to another for processing) and joining become difficult. More data shipping means more latency and network bandwidth bottlenecks. These issues badly reduce database performance. Shared-nothing storage architecture is also used mainly for data-intensive workloads. IBM released its shared-nothing implementation of DB2 in 1990, and Oracle released a shared-nothing implementation in September 2008, both for scalable analytical applications of data warehouses. Amazon’s SimpleDB, the Hadoop Distributed File System and Yahoo’s PNUTS also implement shared-nothing architecture [5-7].

2.2 Shared-disk Database Architecture

Shared-disk database architecture treats the whole database as a single large piece of data stored on a Storage Area Network (SAN) or Network Attached Storage (NAS) that is shared and accessible through the network by all nodes. It requires fewer low-cost servers. It is easy to virtualize them as each compute server is identical. It separates compute from storage, as any number of compute instances may work on the entire data. Middleware is not required to route data requests to specific servers, as each node/client has access to all of the data. Hence, it is more suitable for On-Line Transaction Processing (OLTP) applications. Oracle RAC, IBM DB2 pureScale, Sybase etc. support this architecture [11].

Table 1: Comparison of shared-nothing and shared-disk storage architectures

Architecture | Partitioning | Distributed | Scalability | Useful for OLTP | ACID | Useful for Analytical | Maintenance Cost | Cloud
Shared-Nothing | Y | Y | Y | N | N | Y | High | Y
Shared-Disk | N | Y | Y | Y | Y | Y | Low | Y

Note: N - No, Y - Yes

3. A Comparative Study of Relational Databases and NoSQL Databases

In the earlier stages of computerization, there was more demand for transaction processing applications. As the database industry matured and people accepted computers as part and parcel of their lives, analytical applications became the focus of enterprises. Now they wanted to store
data not only for transaction processing, but also to analyze consumer trends and business needs. Enterprises want to use analytical knowledge to enhance their business value. So, enterprise applications are broadly categorized into transactional and analytical applications. Relational databases played the dominant role in handling transactional data. Later on, industry leaders like IBM and Oracle added analytical capabilities to their relational databases for data mining applications. In the meantime, a number of other databases such as column databases, object-oriented databases etc. came onto the market [12-13]. But they could not overpower the relational databases. Then the Internet revolution and Web 2.0 applications started producing massive sparse and unstructured data. RDBMSs are not suitable for handling massive sparse data sets with loosely defined schemas. The need to store and process such big data defined the role of NoSQL databases in database technology as Cloud databases. RDBMSs and NoSQL databases are briefly discussed as follows:

3.1 Relational Databases

The concept of relational databases is forty years old. It worked best in the era of hardware limits such as small disk space, little memory, slow processor speed and limited networking. It has a rigid database architecture based on tables, columns, indexes, relationships and schema. Data is stored in tables with predefined complex relationships. Column indexes are used for faster search. Highly skilled developers and DBAs are required for database design and maintenance. Conventionally, relational databases are used for transactional data. They include details at the lowest granularity. They contain sensitive and operational data such as employee data and credit card numbers to handle critical business operations. These databases are not well suited to the Cloud environment as they do not support full-content data search and are difficult to scale beyond a limit [14-15].

3.2 NoSQL databases

NoSQL means ‘Not Only SQL’ or ‘Not Relational’. A NoSQL database is defined as a non-relational, shared-nothing, horizontally scalable database without ACID guarantees. NoSQL implementations are classified further into key/value stores, document stores, object stores, tuple stores, column stores and graph stores. They can store and retrieve unstructured, semi-structured and structured data. They are item-oriented. A domain can be compared to a table and contains items having different schemas. The items are identified by keys. All data relevant to a particular item is stored within that item. This improves the scalability of these databases, as complex joins are not required to regroup data from multiple tables. They have the ability to replicate and distribute data over many servers. They are dynamically provisioned on demand. They have emerged to address the requirements of data management in the cloud, as they follow BASE (Basically Available, Soft state, Eventually consistent) in contrast to the ACID guarantees. So, they are not suitable for update-intensive transaction applications. They provide high availability at the cost of consistency [16-17].

Table 2: Comparison of RDBMS and NoSQL databases

RDBMS | NoSQL Databases
Data within a database is treated as a “whole”. | Each entity is considered an independent unit of data and can be freely moved from one machine to the other.
RDBMS support centrally managed architecture. | They follow distributed architecture.
They are statically provisioned. | They are dynamically provisioned.
It is difficult to scale them. | They are easily scalable.
They provide SQL to query data. | They use an API to query data (not as feature-rich as SQL).
ACID (Atomicity, Consistency, Isolation and Durability) compliant; the DBMS maintains consistency. | Follow BASE (Basically Available, Soft state, Eventually consistent); user accesses are guaranteed only at a single-key level.
They support On-Line Transaction Processing applications. | They support Web 2.0 applications.
ORACLE, MySQL, SQL Server etc. are popular RDBMS. | Amazon SimpleDB, Yahoo’s PNUTS, CouchDB etc. are popular NoSQL databases.

4. Challenges to Develop Cloud Databases

Cloud DBMSs should support features of Cloud computing as well as of traditional databases for wider acceptability, which is a Herculean task. The potential challenges associated with cloud databases are as follows:

Fig. 1. Possible issues in makeup of cloud databases.


4.1 Scalability

The main feature of the Cloud paradigm is scalability, which implies that resources can be scaled up or scaled down dynamically without causing any interruption in the service. It challenges developers to build databases in such a way that they can support an unlimited number of concurrent users and unlimited data growth. Enterprises deal with huge volumes of data. Adding servers on demand solves the problem of scalability only if the process and workload are parallelizable. The scalability requirement of transactional data is lower than that of analytical data.

4.2 High Availability and Fault Tolerance

Availability of a database implies that the database is up and running 24 x 7 x 365. It becomes necessary to replicate data across large geographic distances to provide high data availability, durability and high levels of fault tolerance. Amazon’s S3 cloud storage service replicates data across “regions” and “availability zones”.

4.3 Heterogeneous Environment

Users want to access diverse applications from different locations and devices such as mobiles, tablets, notepads and computers. Since user applications and data (structured or unstructured) vary in nature, it becomes difficult to predefine how users will use the system.

4.4 Data Consistency and Integrity

Data integrity is the most critical requirement of all business applications and is maintained through database constraints. The lack of data integrity results in unexpected outputs. Cloud databases follow BASE (Basically Available, Soft state, Eventually consistent) in contrast to the ACID (Atomicity, Consistency, Isolation and Durability) guarantees. So, Cloud databases support eventual consistency due to replication of data at multiple distributed locations. It becomes difficult to maintain the consistency of a transaction in a database which changes too quickly, especially in the case of transactional data. Developers need to follow the BASE approach cautiously. They should not compromise data integrity in their over-enthusiasm to move to cloud databases.

4.5 Simplified Query Interface

A Cloud database is distributed. Querying a distributed database is a major challenge that cloud developers face. A distributed query has to access multiple nodes of the cloud database. There should be a simplified and standardized query interface for querying the database.

4.6 Database Security and Privacy

Data physically stored in a particular country is subject to the local rules and regulations of that country. The USA PATRIOT Act allows the government to demand access to the data stored on any computer. Amazon S3 only allows a customer to choose between US and EU data storage options. If data is encrypted using a key not located at the host, then it is a little safer. Risks are involved in storing transactional data on an untrusted host. Sensitive data is encrypted before being uploaded to the cloud to prevent unauthorized access. Any application running in the cloud should not have the ability to directly decrypt the data before accessing it. Providing security and privacy to different databases on the same hardware is also a big challenge.

4.7 Data Portability and Interoperability

Vendor lock-in is a key obstacle to the adoption of cloud databases. Users want the liberty to move from one vendor to another without any hassle. Lock-in can be avoided through portable and interoperable components. Data portability is the ability to run components written for one cloud provider in another cloud provider’s environment. Interoperability is the ability to write a piece of code that is flexible enough to work with multiple cloud providers, regardless of the differences between them. Currently, there is no standard API to store and access cloud databases. Legacy applications should be able to work with cloud databases. Cloud databases should also be able to interface with business intelligence tools already available in the market [18-19].

5. Industry Practices in Cloud Databases

Cloud databases are designed for low-cost commodity hardware. They scale out easily by distributing the database across multiple hosts/nodes as the load increases. NoSQL databases have become synonymous with cloud databases. A few commonly used cloud databases in the industry are described below.

5.1 Amazon Simple Storage Service (S3) and Databases

Amazon S3 is an Internet-based storage service. It stores objects up to 5 GB in size along with 2 KB of metadata for each object. Objects are organized by buckets. Each bucket is owned by an AWS (Amazon Web Services) account. The buckets are identified by a unique, user-assigned key. Buckets and objects are created, listed and retrieved using either a REST or SOAP interface. Amazon offers MySQL, Oracle and Microsoft SQL Server virtual instances of databases for deployment in its Amazon
Elastic Compute Cloud (EC2) cloud. Even third-party management providers like Elastra and Rightscale offer MySQL images. Scaling is not easy with MySQL but it can be done. EnterpriseDB’s Postgres Plus Advanced Server, a transactional database, also runs in Amazon’s cloud. Earlier, storage was tied to the EC2 instance, so termination of an instance meant loss of the data associated with that instance. With Amazon’s Elastic Block Store (EBS), a user can choose to allocate storage volumes that persist reliably and independently from EC2 instances. Amazon Relational Database Service (RDS) is also a web service that makes it easy to set up and scale a relational database in the Cloud. It is designed for developers or businesses that require the full features and capabilities of a relational database. It gives access to the capabilities of a MySQL, Oracle or SQL Server database engine running on an Amazon RDS database instance [20-21].

5.2 Amazon SimpleDB

It is a highly available, scalable and flexible non-relational data store. It works closely with Amazon S3 and Amazon EC2 to provide the ability to store, process and query data sets in the cloud. It is a NoSQL, name/value pair data store. It offers a simple interface of Get, Post, Delete and Query to run queries on structured data. It is comprised of domains, items, attributes and values. A domain is comparable to a table or a worksheet in a spreadsheet, e.g. an employee table. Domains are further comprised of items (rows), and items are described by attribute-value pairs. Unlike a spreadsheet, it allows cells to contain multiple values per entry. Each item can have its own unique set of associated attributes (e.g. item “1” might have attributes “Basic” and “tax” whereas item “2” may have attributes “Basic”, “tax” and “Saving”). It provides scalability by allowing the user to partition the workload across multiple domains. Initially, a user is allocated a maximum of 250 domains. The user can choose between consistency and eventual consistency. But with complex applications, it is difficult to maintain data integrity. It allows the user to encrypt data before saving it. It does not decode the data but queries directly on the stored strings. It automatically manages replication, indexing of data and performance tuning [22].

5.3 Google App's Bigtable

It is a distributed storage system based on GFS (Google File System) for structured data. It implements a replicated shared-nothing database. It has been successfully deployed in many Google products like Google App Engine. It allows a more complex data store than SimpleDB. It allows entities and properties comparable to tables and columns. One can create an entity by creating a Python object. The Google Datastore API also allows a get, put, delete format for accessing data. It also offers a non-SQL language called GQL, which is not as feature-rich as SQL. Select statements in GQL can be performed on one table only. GQL does not support the “Join” statement [23, 24].

5.4 MapReduce

It is an easy-to-use programming model that supports parallel architecture. It is very scalable and works in a distributed manner. It is useful for massive data processing, large-scale search and data analysis in the cloud. It provides an abstraction by defining a “mapper” and a “reducer”. The “mapper” is applied to every input key/value pair to generate an arbitrary number of intermediate key/value pairs. The “reducer” is applied to all values associated with the same intermediate key to generate output key/value pairs. It has sufficient expressive capability to support many real-world algorithms and tasks. It can partition the input data, schedule the execution of a program across a set of machines, handle machine failures and manage the inter-machine communication. But it cannot be compared to database systems [25].

5.5 Hadoop

It is a programming framework for implementing MapReduce across a large grid of servers. It is distributed in nature and has better scalability than relational and column store databases. It is more suitable for unstructured data. It is not for mixed workloads, complex data structures and multitasking. Hadoop is a Java-based open source project. With the support of Yahoo, Hadoop has achieved great progress. It has been deployed in a large system with 4,000 nodes and is used in many large-scale data processing tasks. It enables the addition of Java software components, provides HDFS (Hadoop Distributed File System) and has been extended to include HBase, a column store database [26].

5.6 Windows Azure Cloud Storage

The aim of Windows Azure Storage is to let users and applications access their data efficiently from anywhere at any time using a simple and familiar programming API. They can use scalable storage to store any amount of data for any length of time on a pay-per-use basis. It supports structured as well as unstructured data, NoSQL databases and queues. It provides three data abstractions: Blobs, Tables and Queues. Blobs provide a simple interface for storing named files along with metadata for the file. Tables provide structured storage. A Table is a set of entities, which contain a set of properties. Queues provide reliable storage and delivery of messages for an application. All information held in Windows Azure storage is replicated three times, which allows fault tolerance [27].
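As an illustration of the “mapper” and “reducer” abstraction described in Section 5.4, the following single-process Python sketch counts words. The names `run_mapreduce`, `word_count_mapper`, `word_count_reducer` and `word_count` are hypothetical and are not part of Hadoop or any other framework. A real deployment would also partition the input, schedule work across machines and handle failures, none of which is modelled here.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def word_count_mapper(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit one intermediate (word, 1) pair per word of an input record."""
    for word in text.lower().split():
        yield word, 1


def word_count_reducer(word: str, counts: List[int]) -> Iterator[Tuple[str, int]]:
    """Reducer: combine all values that share one intermediate key."""
    yield word, sum(counts)


def run_mapreduce(
    records: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Iterable[Tuple[str, int]]],
) -> List[Tuple[str, int]]:
    """Hypothetical in-memory driver (assumption: everything fits on one machine)."""
    # Map phase: apply the mapper to every input key/value pair
    # and group the intermediate values by their key.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in mapper(key, value):
            grouped[inter_key].append(inter_value)

    # Reduce phase: apply the reducer once per intermediate key.
    output: List[Tuple[str, int]] = []
    for inter_key in sorted(grouped):
        output.extend(reducer(inter_key, grouped[inter_key]))
    return output


def word_count(documents: Dict[str, str]) -> Dict[str, int]:
    """Convenience wrapper: document id -> text in, word -> count out."""
    return dict(run_mapreduce(documents.items(), word_count_mapper, word_count_reducer))
```

Calling `word_count({"d1": "cloud data cloud", "d2": "data store"})` groups the intermediate pairs by word before reducing, which is the step a distributed framework would perform across machines.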


5.7 Microsoft SQL Server Data Services (SSDS)

It is a key/value data store, which is also called the cloud extension of Microsoft’s SQL Server. It integrates with Microsoft’s Sync Framework, which is a .NET library for synchronizing dissimilar data sources. It provides schema-free data storage, SOAP or REST APIs and a pay-as-you-go payment system. It has three core concepts: Entity, Container and Authority. An Entity is a property bag of name and value pairs. A Container is a collection of entities. An Authority is a collection of containers and acts as a billing unit [28].

5.8 Sherpa

It was popularly known as PNUTS in earlier publications. Data is organized into tables of records with attributes. Tables can be hashed or ordered. It supports the blob data type along with typical data types. It is a simplified relational data model. It supports selection and projection from a single table and avoids the join operation. Data is replicated asynchronously. It can operate in high availability or high consistency mode. Hadoop can use Sherpa as a data store instead of the native HDFS [29].

5.9 Dynamo

It is a highly available, scalable and distributed key-value data store used by Amazon’s core services. It uses eventual consistency to achieve a high level of availability, i.e. it can write anywhere and the update will eventually propagate to all replicas asynchronously. There is no record structure or index in Dynamo. It permits only single-key updates. It makes extensive use of object versioning and application-assisted conflict resolution [30].

5.10 MegaStore

It blends the scalability of a NoSQL data store and the convenience of a traditional RDBMS to meet the storage requirements of interactive Internet services such as e-mail, documents and social networking. It uses synchronous replication to achieve high availability and a consistent view of the data. It provides transactional (ACID) guarantees within an entity group. It is a flexible data model with user-defined schema, full-text indexes and queues [31].

5.11 CouchDB

CouchDB is a free, open-source project that has been an Apache project since early 2008. It is a document-oriented database written in Erlang. It belongs to the NoSQL generation of databases. Documents (i.e. records) are stored in JSON (JavaScript Object Notation) format and are accessed through an HTTP interface. It allows "views" to be dynamically created using JavaScript. These views map the document data onto a table-like structure that can be indexed and queried. It does not support a non-procedural query language. It achieves scalability through asynchronous replication. It has the unique capability to serve as a self-contained application server and database [32].

5.12 MongoDB

MongoDB is an AGPL (GNU Affero General Public License) open source document-oriented JSON database system being developed at 10gen by Geir Magnusson and Dwight Merriman. It is designed to be a true object database, rather than a pure key/value store. It stores data in JSON-like documents with dynamic schemas. It provides the speed and scalability of key-value stores and the rich functionality of relational databases, such as indexes and dynamic queries. It provides horizontal scalability [33].

Though NoSQL databases are widely accepted as cloud databases in the database landscape, they are not a solution for all problems. They can work easily with large sparse data, but do not provide transactional integrity, flexible indexing, querying and SQL. They are not able to connect with commonly used Business Intelligence tools. It is difficult to find experienced NoSQL programmers, developers and administrators to install and maintain them. So, Cloud databases should be used with full awareness of their limitations.

6. Conclusions

Massive data generated by web-based applications has changed the whole database scenario. Cloud databases appear to be a good solution for handling such data. Moreover, not all organizations can afford to set up expensive data center infrastructure for managing their own databases. The growing popularity of Cloud databases is marking the beginning of a new era of databases. Though cloud databases are not ACID compliant, they are able to handle the massive workloads of web-based applications which do not require such guarantees. Different Cloud databases are available in the market. They share similar concepts and features such as schema-free databases, simple APIs, eventual/timeline consistency, scalability, synchronous/asynchronous replication etc. But each has its unique API, query interface, data model and database functions. These concepts need to be standardized for their better growth. Cloud computing and Cloud databases are set to rule the next decade by overcoming the limitations they have.

References
[1] Rajkumar Buyya et al., “Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering
Copyright (c) 2012 International Journal of Computer Science Issues. All Rights Reserved.
IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No 3, July 2012
ISSN (Online): 1694-0814
www.IJCSI.org 83

computing as the 5th utility”, Future Generation Computer
Systems, Vol. 25, Issue 6, June 2009, pp. 599-616.
[2] Jiyi Wu et al., “Recent Advances in Cloud Storage”, in Third
International Symposium on Computer Science and
Computational Technology (ISCSCT ’10), Jiaozuo, P. R.
China, 14-15 August 2010, pp. 151-154.
[3] Database as a Service: Reference Architecture – An
Overview, An Oracle White Paper on Enterprise
Architecture, September 2011,
http://www.oracle.com/technetwork/topics/entarch/oes-refarch-dbaas-508111.pdf
last accessed on May 28, 2012.
[4] http://xeround.com last accessed on May 25, 2012.
[5] Daniel J. Abadi, “Data Management in the Cloud:
Limitations and Opportunities”, Bulletin of the IEEE
Computer Society Technical Committee on Data
Engineering, 2009, 32(1):3-12.
[6] http://aws.amazon.com/simpledb/ last accessed on May 23, 2012.
[7] B.F. Cooper et al., “PNUTS: Yahoo!’s Hosted Data Serving
Platform”, in International Conference on Very Large Data
Bases (VLDB), Vol. 1, No. 2, 2008, pp. 1277–1288.
[8] Mike Hogan, “Cloud Computing & Databases”, November 14, 2008.
[9] Emmanuel Cecchet et al., “Dolly: Virtualization-driven
Database Provisioning for the Cloud”, UMass Technical
Report UM-CS-2010-006.
[10] Daniel J. Abadi, “ColumnStores vs. RowStores: How
Different Are They Really?”, in International Conference on
Management of Data, SIGMOD ’08.
[11] Donald Kossmann, Tim Kraska, Simon Loesing, “An
Evaluation of Alternative Architectures for Transaction
Processing in the Cloud”, SIGMOD ’10, June 2010.
[12] Daniel J. Abadi et al., “Column-oriented Database Systems”,
VLDB ’09.
[13] Stonebraker et al., “C-Store: A Column-oriented DBMS”.
[14] Thakur Ramjiram Singh, “Cloud Computing: An Analysis”,
International Journal of Enterprise Computing and Business
Systems, Vol. 1, Issue 2, July 2011, pp. 2230-8849.
[15] Rick Cattell, “Scalable SQL and NoSQL Data Stores”,
ACM SIGMOD, Vol. 39, Issue 4, 2011, pp. 12-27.
[16] Arpita Mathur et al., “Cloud Based Distributed Databases:
The Future Ahead”, International Journal on Computer
Science and Engineering (IJCSE), Vol. 3, No. 6, 2011.
[17] Bo Peng, “Implementation Issues of A Cloud Computing
Platform”, Bulletin of the IEEE Computer Society
Technical Committee on Data Engineering.
[18] Mihaela Ion, “Enforcing multi-user access policies to
encrypted cloud databases”, IEEE International Symposium
on Policies for Distributed Systems and Networks, 2011, pp. 175-177.
[19] Maggiani, R., “Cloud computing is changing how we
communicate”, IPCC 2009, 2009, pp. 1-4.
[20] http://aws.amazon.com/rds/S3 last accessed on May 24, 2012.
[21] http://aws.amazon.com/rds/ last accessed on May 24, 2012.
[22] http://aws.amazon.com/simpledb last accessed on May 25, 2012.
[23] S. Ghemawat et al., “The Google File System”, in
Proceedings of 19th ACM Symp. Operating System
Principles (SOSP 03), ACM Press, 2003, pp. 29–43.
[24] F. Chang et al., “Bigtable: A Distributed Storage System for
Structured Data”, in 7th Usenix Symp. Operating Systems
Design and Implementation (OSDI 06), Usenix Assoc.,
2006, pp. 205–218.
[25] Dawei Jiang et al., “MAP-JOIN-REDUCE: Toward
Scalable and Efficient Data Analysis on Large Clusters”,
IEEE Transactions on Knowledge and Data Engineering,
Vol. 23, No. 9, 2011.
[26] D. Borthakur, “The Hadoop Distributed File System:
Architecture and Design, Apache Software Foundation”,
http://hadoop.apache.org/core/docs/r0.16.4/hdfs_design.html
last accessed on May 27, 2012.
[27] Troy Davis, “Cloud Computing Use Cases and
Considerations”, http://digissance.com/ Cloud Computing
Talk.pdf last accessed on June 10, 2012.
[28] www.windowsazure.com/en-us/develop/net/.../cloud-storage/
last accessed on June 10, 2012.
[29] Brian Cooper et al., “Building a Cloud for Yahoo”, Bulletin
of the IEEE Computer Society Technical Committee on
Data Engineering, 2009.
[30] Giuseppe DeCandia et al., “Dynamo: Amazon’s Highly
Available Key-value Store”, in 21st ACM Symposium on
Operating System Principles, SOSP 2007, pp. 205-220.
[31] Jason Baker et al., “Megastore: Providing Scalable, Highly
Available Storage for Interactive Services”, in 5th Biennial
Conference on Innovative Data Systems Research
(CIDR ’11), 2011, pp. 223-234.
[32] http://www.couchbase.com/couchdb last accessed on May 31, 2012.
[33] http://www.mongodb.org last accessed on May 31, 2012.

Indu Arora obtained her MCA degree from Guru Nanak Dev
University in 1992. She has been working as Assistant Professor
in Computer Science & Applications at MCMDAV College,
Chandigarh since 1998. She also served at BBKDAV College
(Aug. 1993 – Oct. 1997) and AB College, Pathankot (Aug. 1992 –
Feb. 1993). She is also pursuing a Doctor of Philosophy in the
Department of Computer Science & Applications at Panjab
University, Chandigarh. Her research interests include Internet
technologies, databases and Cloud Computing. She has many
research papers to her credit.

Dr. Anu Gupta has been working as Assistant Professor in
Computer Science and Applications at Panjab University,
Chandigarh (India) since July 1998. She held the position of
Chairperson, Department of Computer Science & Applications,
Panjab University, Chandigarh from Feb. 2008 to Jan. 2011. She
was awarded the University medal for securing first position in M.C.A.
at Punjabi University, Patiala, Punjab in the year 1997. She has
experience of working on several platforms using a variety of
development tools and application packages. She obtained her Doctor
of Philosophy degree from Panjab University in the area of
Free/Open Source Software. Her research interests include Cloud
Computing, Networking, Multimedia Technologies, E-Commerce
and Software Engineering. She is a life-member of ‘Computer
Society of India’ and ‘Indian Academy of Science’. She has
published several research papers in various journals and
conferences.
