
SQL for Data Analytics: Harness the power of SQL to extract insights from data, Third Edition

Authors: Jun Shan, Matt Goldwasser, Upom Malik, Benjamin Johnston
Rating: 4.8 (53 Ratings)
eBook, Aug 2022, 540 pages, 3rd Edition
eBook: $35.98 (reduced from $39.99)
Paperback: $49.99
Subscription: Free Trial, renews at $19.99 p/m

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

SQL for Data Analytics

1. Understanding and Describing Data

Overview

By the end of this chapter, you will be able to explain data and statistics and classify data based on its characteristics. You will find out how to calculate basic univariate statistics of data and identify outliers. You will also learn how to use bivariate analysis to understand the relationship between two variables.

Introduction

Data collection and analysis is an old practice going back to the beginning of civilization. Records from ancient Egyptian papyrus suggest that pharaohs collected census information from villages, possibly to determine the number of soldiers that could be enlisted for war. However, it was only after the arrival of modern computers that data analytics became a significant phenomenon that changes people's lives every day.

This book, as its name suggests, teaches you how to use Structured Query Language (SQL) for data analytics. SQL is the tool that you will focus on for the rest of the book. But before diving into SQL, this chapter provides an overview of data analytics. You will be introduced to fundamental concepts, such as the definition and types of statistics and different statistical methods, which lay the foundation for the concepts in later chapters and define the purpose of the SQL operations that you will learn...

Data Analytics and Statistics

Raw data is a group of values that you can extract from a source. It becomes useful when it is processed to find different patterns in the data that was extracted. These patterns, also referred to as information, help you to interpret the data, make predictions, and identify unexpected changes in the future. This information is then processed into knowledge.

Knowledge is a large, organized collection of persistent and extensive information and experience that can be used to describe and predict phenomena in the real world. Data analysis is the process by which you convert data into information and, thereafter, knowledge. Data analytics is when data analysis is combined with making predictions.

There are several data analysis techniques available to make sense of data. One of them is statistics, which uses mathematical techniques on datasets.

Statistics is the science of collecting and analyzing a large amount of data to identify the characteristics...

Types of Statistics

Statistics can be further divided into two subcategories: descriptive statistics and inferential statistics.

Descriptive statistics are used to describe a collection of data. For example, the average age of people in a country is a descriptive statistic that describes one aspect of the country's residents. Descriptive statistics on a single variable in a dataset are called univariate analysis, while descriptive statistics that look at two or more variables at the same time are called multivariate analysis. In particular, statistics that look at exactly two variables are called bivariate analysis. The average age of a country's residents is an example of univariate analysis, while an analysis examining the interaction between GDP per capita, healthcare spending per capita, and age is multivariate analysis.

In contrast, inferential statistics allows datasets to be collected as a sample or a small portion of measurements from a larger group, called a population....

Working with Missing Data

In all the examples so far, you have been dealing with datasets that are clean and easy to decipher. However, datasets in the real world are more complicated than these. One of the many problems you may have to deal with when working with datasets is missing values.

You will learn the specifics of preparing data in Chapter 3, SQL for Data Preparation. In this section, however, you will learn several strategies that you can use to handle missing data. These strategies include the following:

  • Deleting rows: If a very small number of rows (that is, less than 5% of your dataset) is missing data, then the simplest solution may be to just delete the data points from your set. This would not impact your results too much.
  • Mean/median/mode imputation: If 5% to 25% of your data for a variable is missing, another option is to take the mean, median, or mode of that column and fill in the blanks with that value. It may provide a...
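The first two strategies in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: the `ages` values are hypothetical, missing entries are represented as `None`, and the helper names are made up for this example.

```python
from statistics import mean, median, mode


def missing_fraction(values):
    """Return the share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_missing(values):
    """Deleting rows: keep only the observed entries."""
    return [v for v in values if v is not None]


def impute(values, strategy="mean"):
    """Mean/median/mode imputation: fill the blanks with one summary value."""
    observed = drop_missing(values)
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with 2 of 10 values missing (20%).
ages = [34, 29, None, 41, 29, 38, None, 45, 31, 29]

print(missing_fraction(ages))   # 0.2
print(impute(ages, "median"))   # blanks become 32.5
```

With 20% of the values missing, this column falls in the range where the text suggests imputation rather than deletion. Which summary value to use depends on the shape of the data; the median is less sensitive to extreme values than the mean.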

Statistical Significance Testing

Often, an analyst is interested in comparing the statistical properties of two groups, or perhaps just one group before and after a change. Of course, the difference between these two groups may just be due to chance.

An example of where this comes up is in marketing A/B tests. Companies often test two different versions of a landing page for a product and measure how many clicks each version receives. For example, if you make the image of a product two times larger, will this make people more likely to click it? You may find that 10% of the visitors to variation A of the landing page clicked on the product, compared with 11% for variation B. So, does that mean variation B is 10% better than A in relative terms (one percentage point in absolute terms), or is this just a result of day-to-day variance? You need a method based on statistics to determine just that.

Statistical significance testing is the method of determining whether the data that you have supports a certain hypothesis. To build...
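To make the landing page question concrete, here is a minimal sketch of one common significance test, the two-proportion z-test. It is not necessarily the method the book goes on to use. The visitor counts are hypothetical, since the text gives only the 10% and 11% click rates, and the function name is made up for this example.

```python
from math import erfc, sqrt


def two_proportion_z_test(clicks_a, visitors_a, clicks_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in click rates."""
    p_a = clicks_a / visitors_a
    p_b = clicks_b / visitors_b
    # Pooled rate under the null hypothesis that both pages perform the same.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical traffic: the same 10% vs 11% rates at two sample sizes.
small = two_proportion_z_test(100, 1_000, 110, 1_000)
large = two_proportion_z_test(10_000, 100_000, 11_000, 100_000)

print(f"1,000 visitors each:   z={small[0]:.2f}, p={small[1]:.3f}")
print(f"100,000 visitors each: z={large[0]:.2f}, p={large[1]:.3g}")
```

The same one-point gap is indistinguishable from chance with 1,000 visitors per page but very unlikely to be chance with 100,000 per page, which is why the raw click rates alone cannot answer the question.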

SQL and Analytics

Throughout this chapter, you have learned about different techniques used in data analytics. All these analytics techniques inevitably lead to the storage and processing of massive data. While there are many tools in today's market that can help you with these tasks, a relational database is the most important one.

A relational database is a convenient and easy-to-understand way to store datasets. Modern relational database management systems, such as PostgreSQL databases, also provide a powerful tool for processing and analyzing data, which is SQL. Using SQL, you can clean data, transform data into more useful formats, and analyze data with statistics to find interesting patterns. The rest of this book will be dedicated to understanding how you can use SQL for these purposes productively and efficiently.
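As a small taste of what cleaning and analyzing data with SQL looks like, the sketch below runs a query from Python. It makes two assumptions that are not from the book: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for PostgreSQL, and the `sales` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 200.0), ("south", None)],
)

# Clean (skip missing amounts), then summarize each region.
rows = conn.execute(
    """
    SELECT region, COUNT(amount) AS n, AVG(amount) AS avg_amount
    FROM sales
    WHERE amount IS NOT NULL
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()

print(rows)  # [('north', 2, 100.0), ('south', 1, 200.0)]
```

The query itself uses only standard SQL clauses (SELECT, WHERE, GROUP BY, ORDER BY), so the same statement should also run on PostgreSQL.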

Summary

Data analytics is a powerful method through which you analyze raw data to find patterns and make predictions that help you to understand the world. The goal of analytics is to turn data into information and knowledge. To accomplish this goal, statistical techniques, such as descriptive statistics and statistical significance testing, are used to understand data.

Univariate analysis, a branch of descriptive statistics, can be utilized to understand a single variable of data. It can also be used to find outliers and the distribution of data by utilizing frequency distributions and quantiles. It is useful in finding the central tendency of a variable by calculating the mean, median, and mode of data and the dispersion of data using the range, standard deviation, and IQR.
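The measures named above can be computed with Python's standard `statistics` module. This is a minimal sketch under stated assumptions: the data values are hypothetical, quartiles use the module's "inclusive" method (other conventions give slightly different values), and the 1.5 × IQR fence for flagging outliers is a common rule of thumb rather than something this summary specifies.

```python
from statistics import mean, median, mode, quantiles, stdev


def describe(values):
    """Central tendency, dispersion, and IQR-based outliers for one variable."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed rule of thumb: points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),  # sample standard deviation
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }


# Hypothetical single variable with one unusually large value.
data = [2, 4, 4, 5, 7, 9, 11, 12, 40]
summary = describe(data)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the median (7) sits well below the mean (about 10.4) because the single large value pulls the mean upward, and the same value is the only point flagged as an outlier.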

Bivariate analysis is used to understand the relationship between two variables. You can determine trends, changes in trends, periodic behavior, and anomalous points regarding two variables by using scatterplots. You can...


Key benefits

  • Master each concept through practical exercises and activities
  • Discover various statistical techniques to analyze your data
  • Implement everything you’ve learned on a real-world case study to uncover valuable insights

Description

Every day, businesses operate around the clock, and a huge amount of data is generated at a rapid pace. This book helps you analyze this data and identify key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level. SQL for Data Analytics, Third Edition is a great way to get started with data analysis, showing how to effectively sort and process information from raw data, even without any prior experience.

You will begin by learning how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you will learn how to write SQL queries to aggregate, calculate, and combine SQL data from sources outside of your current dataset. You will also discover how to work with advanced data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you will be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation.

By the end of this book, you will be proficient in the efficient application of SQL techniques in everyday business scenarios and able to look at data with the critical eye of an analytics professional.

Who is this book for?

If you're a database engineer looking to transition into analytics or a backend engineer who wants to develop a deeper understanding of production data and gain practical SQL knowledge, you will find this book useful. This book is also ideal for data scientists or business analysts who want to improve their data analytics skills using SQL. Basic familiarity with SQL (such as basic SELECT, WHERE, and GROUP BY clauses) as well as a good understanding of linear algebra, statistics, and PostgreSQL 14 are necessary to make the most of this SQL data analytics book.

What you will learn

  • Use SQL to clean, prepare, and combine different datasets
  • Aggregate basic statistics using GROUP BY clauses
  • Perform advanced statistical calculations using a WINDOW function
  • Import data into a database to combine with other tables
  • Export SQL query results into various sources
  • Analyze special data types in SQL, including geospatial, date/time, and JSON data
  • Optimize queries and automate tasks
  • Think about data problems and find answers using SQL

Product Details

Publication date: Aug 29, 2022
Length: 540 pages
Edition: 3rd
Language: English
ISBN-13: 9781801817806


Packt Subscriptions

$19.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

$199.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

$279.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

Frequently bought together


  • Mastering Microsoft Power BI – Second Edition: $49.99
  • Modern Time Series Forecasting with Python: $52.99
  • SQL for Data Analytics: $49.99

Total: $152.97

Table of Contents

9 Chapters
1. Understanding and Describing Data
2. The Basics of SQL for Analytics
3. SQL for Data Preparation
4. Aggregate Functions for Data Analysis
5. Window Functions for Data Analysis
6. Importing and Exporting Data
7. Analytics Using Complex Data Types
8. Performant SQL
9. Using SQL to Uncover the Truth: A Case Study

Customer reviews

Overall rating: 4.8 (53 Ratings)
Rating distribution:
5 star 84.9%
4 star 15.1%
3 star 0%
2 star 0%
1 star 0%

Top Reviews

Jay Potter, Sep 20, 2022
Rating: 5 out of 5

“SQL for Data Analytics” is an excellent book that is beneficial for both a new SQL learner as well as providing helpful guidance for more knowledgeable SQL users. It would work well in the classroom setting, as well as for an individual learning on their own.

I appreciate that this text starts before the use of SQL for gaining a broader understanding of data and relevant statistic concepts. Each chapter begins with an overview and ends with a summary, which really helps the reader to understand where they are going next and how the concepts build upon one another. The text does not leave you wondering where you are in the process of learning SQL. It provides relevant examples that build along with the lessons, all the way up to a case study that rounds out all of the knowledge from the earlier lessons.

I recommend this title for anyone wanting to learn more about SQL, whether you are in a class with others, or learning on your own, it will be of great service to your learning as you follow along with the concepts and examples.

My only caveat about this book is that some of the software versions that they give examples from and instruction on how to use are a few years old, with a few slight variations due to upgrades. They do provide step-by-step instruction on downloading and using these programs for Windows, Mac, and Linux, which is very helpful, but do note that some parts may be slightly different if you have a newer version of a program or operating system.

I received this copy of “SQL for Data Analytics” in return for writing this review, this in no way shaped my thoughts on this book. As a former professor and current practitioner in data analysis and SQL I recommend this book regardless.

Amazon Verified review

Martin Melgar, Oct 25, 2022
Rating: 5 out of 5

An excellent and insightful look into SQL. This book is easily digestible and the authors give very concise and easy to follow instructions on both setting up the PostGres server as well as giving great exercises to follow that one can upload to their own GitHub to show experience. I highly recommend this for anyone who wants to start learning SQL or anyone who wishes to practice their SQL skills more in depth.

- Martin

Amazon Verified review

Sahil, Sep 15, 2022
Rating: 5 out of 5

Lots of great material here laid out in an easy to digest format. Especially appreciate the focus of the last chapters on optimizing SQL queries and the case study.

Amazon Verified review

J.P.G., Apr 23, 2023
Rating: 5 out of 5

SQL for Data Analytics is a comprehensive and practical guide for anyone interested in becoming a data analyst or improving their SQL skills. The book is structured in a way that allows you to learn SQL concepts and apply them to actual real-world scenarios.

I was impressed with the book's approach to data analysis, starting with how to form hypotheses and generate related statistics, and then progressing to advanced statistical calculations and techniques. The author has done a great job of making even the most complex concepts easy to understand, with clear explanations and practical examples.

What I appreciated most about this book was the emphasis on hands-on learning. The practical exercises and activities throughout the book help you master each concept and apply what you've learned in a real-world setting. The case study at the end of the book is particularly helpful, as it allows you to use all the techniques you've learned in a single project.

The book assumes no prior knowledge of SQL, and it's written in a way that's accessible to beginners as well. Even if you're an experienced SQL user, you'll find techniques that you can apply to your work.

Overall, I highly recommend SQL for Data Analytics to anyone interested in data analysis or looking to improve their SQL skills. The book is engaging, informative, and practical, and it's a great addition to your professional toolkit!

Amazon Verified review

Mrs. Campbell, Sep 27, 2022
Rating: 5 out of 5

The book SQL for Data Analytics, Third Edition is a great resource for beginners who have had some basic exposure to SQL and intermediate to high level users alike and covers a wide range of use of this versatile data retrieval and manipulation language. For anyone looking to have a career in data analytics/analysis, SQL is a must-have and in-demand skill so having the right foundational knowledge of the different ways to use it is key.

The beginning chapter gives a quick, broad level overview of the data analytics, data analysis, statistics, and working with data sets. Some of the topics could be its own book in and of itself; but the authors did a good job of giving a high level overview and resultant mini-exercises to reinforce the readers’ understanding of them. This chapter does not go into the SQL language just yet but prepares the reader for the types of analysis they will be doing with it.

The next chapters focus on the syntax and structure of making SQL queries. Basics such as the SELECT statement, etc are covered in good detail with sample exercises given using a PostgreSQL working environment which they ask you to install on your computer. It also goes over CRUD functions and concepts such as case when, aggregate functions and other helpful statements that most people will use with SQL. The exercises are lined up perfectly after a concept is introduced in order to reinforce them. I appreciated some of the minute details and tips such as how to determine whether every value in a column is unique and other helpful query examples that one might run into in your day to day business needs.

After the chapter on Window functions, which is a more intermediate use case, the book goes into concepts that are more advanced in nature and will be appreciated by more veteran analysts such as using python to read/write data in your database using pandas and working with more complex data types such as geospatial, array, and JSON. It also goes over tips and techniques on how to make your SQL more efficient and ends with a case study that the reader can then draw from all the skills learned in the book.

The book is very thorough in showing how versatile SQL can be - maybe a little too thorough in some regards. For example, the section on text analytics is very appreciated (I personally was not aware of being able to tokenize text in SQL). However, for most data folks I believe other languages such as R and Python are preferred to conduct text analytics.

Overall, I highly recommend this book as it speaks to both data analysts or scientists alike with a wide range of abilities in trying to optimize their understanding of SQL alongside real-world business use cases.

Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.