Lecture 01

The document discusses various applications of time series analysis and forecasting including transport and stock price prediction. It also covers Internet of Things (IoT) and how devices can be connected to analyze streaming data using techniques like window-based descriptive statistics and identifying seasonal/trend patterns.

Dr. Firoz Anwar
Source: https://www.edureka.co/blog/what-is-data-science/
Source: https://www.dataquest.io/blog/what-is-data-science/
Source: https://data-flair.training/blogs/data-science-applications/
§ Time Series Analysis
  § Method of analysing time series data to extract meaningful patterns and characteristics of the data
§ Time Series Forecasting
  § Use of a model to predict future values based on previously observed data
§ Transport projects
  § Sydney Metro Project
    § Analysing/Estimating number of passengers
    § Peak/Off-Peak hour flow
    § Number of services
  § WestConnex Project
    § Analysing/Estimating number of transports
    § Peak/Off-Peak hour flow
§ Stock Price Prediction
  § Estimating Stock Price index
§ How is it different from Traditional Regression Analysis?
  § It is time dependent, so the basic assumption of a linear regression model, that the observations are independent, doesn’t hold in this case.
  § It has seasonality trends, i.e. variations specific to a particular time frame. For example, if you look at the sales of a woollen jacket over time, you will invariably find higher sales in winter seasons.
§ Trend
  § A general direction in which something is developing or changing.
§ Seasonality
  § Predictable pattern that recurs or repeats over regular intervals, typically within a year or less.
§ Irregular fluctuation
  § Variations that occur due to sudden causes and are unpredictable.

Source: https://towardsdatascience.com/
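The trend and seasonality components described above can be illustrated with a small sketch. It assumes a synthetic daily series with a weekly (7-step) seasonal pattern and a linear trend, and it leaves out the irregular component so the result is deterministic. The helper name moving_average and all numbers are hypothetical, and the code is written for Python 3 rather than the Python 2 used on the later slides.

```python
def moving_average(values, window):
    """Centered moving average; None where the window does not fit."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            smoothed.append(None)
        else:
            chunk = values[i - half:i + half + 1]
            smoothed.append(sum(chunk) / float(len(chunk)))
    return smoothed

# Hypothetical weekly pattern whose values sum to zero
season = [3.0, 1.0, -2.0, -4.0, -1.0, 1.0, 2.0]

# Synthetic series: linear trend plus the repeating weekly pattern
series = [10.0 + 0.5 * t + season[t % 7] for t in range(28)]

# Averaging over one full period cancels the seasonal part and leaves the trend
trend = moving_average(series, 7)

# Subtracting the estimated trend leaves the seasonal pattern
detrended = [s - m for s, m in zip(series, trend) if m is not None]

print(trend[3])       # trend estimate at t = 3
print(detrended[:7])  # one period of the recovered seasonal pattern
```

With real data the detrended values would also contain the irregular fluctuation, so the seasonal pattern is usually estimated by averaging the detrended values for each position in the period.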
§ Typical Steps
  § Understanding the Data
  § Hypothesis
  § Feature Extraction
  § Exploratory Data Analysis (EDA)
  § Forecasting with Multiple Models
    § Naive approach, moving average, simple exponential smoothing, Holt's linear trend model, Autoregressive Integrated Moving Average (ARIMA), SARIMAX, etc.
  § Model Evaluation
    § Mean Squared Error (MSE), Root Mean Squared Error (RMSE), etc.
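As a rough illustration of the forecasting and evaluation steps, the sketch below compares the naive approach with a moving-average forecast using RMSE on a small made-up series. The function names, the series values, the split point and the window size are all assumptions chosen for the example, and the code targets Python 3.

```python
import math

def naive_forecast(history):
    """Naive approach: the next value equals the last observed value."""
    return history[-1]

def moving_average_forecast(history, window):
    """Moving average: the next value equals the mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / float(len(recent))

def rmse(actual, predicted):
    """Root Mean Squared Error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

series = [12, 14, 13, 15, 16, 18, 17, 19]  # hypothetical observations
split = 5                                  # first 5 points are used as history

naive_preds, ma_preds = [], []
for t in range(split, len(series)):
    history = series[:t]  # only data observed before time t
    naive_preds.append(naive_forecast(history))
    ma_preds.append(moving_average_forecast(history, window=3))

actual = series[split:]
print("naive RMSE:", rmse(actual, naive_preds))
print("moving average RMSE:", rmse(actual, ma_preds))
```

Each forecast uses only the observations that precede the point being predicted, which is what separates this evaluation from fitting and scoring on the same data.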
“Around 25-50 billion devices are expected to be connected to the Internet by 2020.” (Mahdavinejad et al. 2017)

§ Network of Connected Devices
  § Interact with the environment
  § Data accessed
  § Configured/Manipulated

Source: https://www.channelfutures.com/
§ HTTPS/JSON
§ Plain text
§ Binary data
§ XML
§ Proprietary
§ Stream Data
§ Periodical data collection from device
§ API endpoints
§ Window-based descriptive statistics
§ Seasonal pattern
§ Trend pattern
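A window-based descriptive statistic over stream data can be sketched as follows. This is a minimal illustration, assuming readings arrive one at a time from some device; the class name RollingWindow, the window size and the sensor values are hypothetical. The code is written for Python 3.

```python
from collections import deque
import math

class RollingWindow(object):
    """Keeps the most recent `size` readings and summarises them."""

    def __init__(self, size):
        # deque with maxlen drops the oldest reading automatically
        self.values = deque(maxlen=size)

    def add(self, value):
        self.values.append(value)

    def mean(self):
        return sum(self.values) / float(len(self.values))

    def std(self):
        m = self.mean()
        variance = sum((v - m) ** 2 for v in self.values) / len(self.values)
        return math.sqrt(variance)

# Hypothetical temperature readings collected periodically from a device
readings = [20.0, 21.0, 19.0, 22.0, 30.0, 21.0]

window = RollingWindow(size=3)
means = []
for reading in readings:
    window.add(reading)
    means.append(window.mean())

print(means)
print(window.std())
```

Only the last few readings are held in memory, so the same approach works for a stream of any length.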
Source: https://www.i-scoop.eu/internet-of-things-guide/
CGM (Continuous Glucose Monitoring)
§ “Glucose Concentration can be Predicted Ahead in Time From Continuous Glucose Monitoring sensor Time-Series” by Sparacino et al.
  § Parameter estimation
  § Weighted Linear Regression on sampling window
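The idea of a weighted linear regression on a sampling window can be sketched as below. This is only an illustration of the general technique, not the procedure from the cited paper: the exponential weighting, the forgetting value 0.9, the window length, the function names and the synthetic glucose values are all assumptions. The code is written for Python 3.

```python
def weighted_line_fit(times, values, forgetting=0.9):
    """Fit values = intercept + slope * time by weighted least squares.

    The newest sample gets weight 1; older samples are down-weighted
    by powers of `forgetting` (an assumed weighting scheme).
    """
    n = len(times)
    weights = [forgetting ** (n - 1 - i) for i in range(n)]
    sw = sum(weights)
    swt = sum(w * t for w, t in zip(weights, times))
    swy = sum(w * y for w, y in zip(weights, values))
    swtt = sum(w * t * t for w, t in zip(weights, times))
    swty = sum(w * t * y for w, t, y in zip(weights, times, values))
    slope = (sw * swty - swt * swy) / (sw * swtt - swt ** 2)
    intercept = (swy - slope * swt) / sw
    return intercept, slope

def predict_ahead(times, values, horizon, window=6, forgetting=0.9):
    """Fit a line on the last `window` samples and extrapolate by `horizon`."""
    recent_t = times[-window:]
    recent_y = values[-window:]
    intercept, slope = weighted_line_fit(recent_t, recent_y, forgetting)
    return intercept + slope * (recent_t[-1] + horizon)

# Synthetic readings every 5 minutes, rising by 2 units per minute
times = [0, 5, 10, 15, 20, 25, 30, 35]
glucose = [100.0 + 2.0 * t for t in times]

forecast = predict_ahead(times, glucose, horizon=30)
print(forecast)  # extrapolated value 30 minutes after the last sample
```

Because the synthetic data lie exactly on a line, the fit recovers that line whatever the weights are; with noisy sensor data the weights decide how quickly the fit follows recent changes.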

Source: “Hands-On Artificial Intelligence for IoT” by Amita Kapoor


Source: Mahdavinejad, M.S. et al. “Machine learning for Internet of Things data analysis: A survey”. Digit. Commun. Netw. 2017
Source: https://cuppa.uic.edu/academics/upp/upp-programs/certificates/gsav-certificate/
House Price Prediction
§ Regression Model without Geo-data
§ Regression Model with Geo-data

Source: https://towardsdatascience.com/
§ Data Analysis is the heart of Data Science
§ A combination of various analytics/advanced analytics shows the bigger picture.
Introduction to Python Programming
Why Python Programming

§ Why Programming?
  § Programming is a tool to realise your data analysis ideas
  § Data Science relies on programming heavily (why?)
§ Why Python Programming?
  § Interpreted programming language
  § Can run interactively (natively interactive, including terminal and IPython)
  § Other fancy stuff: Jupyter Notebook, Python markdown, ...
  § Easy and flexible syntax
  § Powerful third-party package support
  § Convenient interface with other languages such as C/C++
The classic Hello, World! program

Python version
print "Hello, world!"

Java version
public class HelloWorldApp {
    public static void main(String[] args)
    {
        System.out.println("Hello, World!");
    }
}

C++ version
#include <iostream>
int main(){
    std::cout<<"Hello, World!"<<std::endl;
    return 0;
}
Basics of Python Programming
Output and input

The print command

print command is used in Python to display messages
§ print "Hello, World" ✓
§ print 'Hello, World' ✓
§ print Hello, World ✗ Must enclose message with quotation marks
§ print "Hello, World' ✗ Quotation marks do not match
§ Print "Hello, World" ✗ print should be lower case
Display on Multiple Lines

print "Hello, World"
print "I love programming"
print "I love Python"

## Hello, World
## I love programming
## I love Python

or

print "Hello, World\nI love programming\nI love Python"

## Hello, World
## I love programming
## I love Python

Use \n to start a new line

Display Special Characters

print "\"Hello, World\""
print "\'Hello, World\'"

## "Hello, World"
## 'Hello, World'

Display Variable Values

s = "Hello"
t = "World"
print s, t

## Hello World
Keyboard input method
Read a string

# named s rather than str, so the built-in str() stays usable
s = raw_input("Enter a string: ")

Read an integer

# input() evaluates the typed text as a Python expression
x = input("Enter an integer: ")
# read a string and convert it to int
x = raw_input("Enter an integer: ")
x = int(x)

Read a fractional number

x = input("Enter a number: ")
# read a string and convert it to float
x = raw_input("Enter a number: ")
x = float(x)
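The string-to-number conversions above can be wrapped in a small helper. The sketch below assumes Python 3, where raw_input() has been renamed input() and always returns a string, so the conversion step is always needed. The function name parse_number is hypothetical, and the keyboard prompt is left as a comment so the example runs without user interaction.

```python
def parse_number(text):
    """Convert typed text to an int if possible, otherwise to a float.

    Raises ValueError if the text is not a number at all.
    """
    text = text.strip()      # ignore surrounding spaces and the newline
    try:
        return int(text)
    except ValueError:
        return float(text)

# In an interactive Python 3 program the text would come from the keyboard:
#     text = input("Enter a number: ")
for text in ["42", " 3.5 ", "-7"]:
    value = parse_number(text)
    print(repr(text), "->", value, type(value).__name__)
```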
Variables, data types and operators
Variables

§ A variable is a name for data stored ‘in’ the program
§ A variable is automatically created by assignment
  § No need to define variables (unlike C++ or Java)
    variable = expression
§ Variable naming rules
  § Cannot use keywords (if, else, while, for, ...)
  § Must be one word and cannot contain spaces
  § First character must be a letter (upper or lower case) or '_'
  § Cannot contain any character other than letters, numbers and '_'
§ Case sensitive: student and Student are distinct variables
Data types

§ A variable can be used to hold different types of data
§ Common data types used in Python
  § Whole Numbers (Integer): -5, 1, 3
  § Fractional Numbers (Float): 1.5, -0.8
  § Strings: "hello", "Room", "PYTHON"
  § Ordered Data: call by index
    § Lists: ['Australia', 'China', 'USA']
    § List can contain mixed types: ['Australia', 'China', 2, 5.8]
    § Tuples: ('Australia', 'China', 'USA')
    § Tuples are immutable (can't change once created)
  § Unordered Data: call by key
    § Dictionaries: {'name': 'Jackson', 'Title': 'Dr', 'Age': 30}
    § Basically a list of key-value pairs
§ Complex data types, e.g. lists of lists
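A short sketch of the data types listed above, showing access by index for ordered data and access by key for dictionaries. The variable names are chosen for illustration only, and the code is written for Python 3.

```python
room = 234                                   # integer
ratio = 1.5                                  # float
word = "PYTHON"                              # string
countries = ["Australia", "China", 2, 5.8]   # list with mixed types
states = ("Australia", "China", "USA")       # tuple
person = {"name": "Jackson", "Title": "Dr", "Age": 30}  # dictionary
grid = [[1, 2, 3], [4, 5, 6]]                # list of lists

# type() reports the data type currently held by a variable
type_names = [type(v).__name__
              for v in (room, ratio, word, countries, states, person)]
print(type_names)

second_country = countries[1]   # ordered data: call by index (starts at 0)
title = person["Title"]         # dictionary: call by key
middle = grid[1][1]             # row 1, column 1 of the nested list
print(second_country, title, middle)
```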


Examples of variable assignment

§ room=234 ✓ Assign an integer 234 to the variable named room.
§ room='234' ✓ Assign a string '234' to the variable named room.
§ room=[234,123] ✓ Assign a list to the variable named room; the list contains two integers: 234 and 123.
§ 234=room ✗ A variable can only be placed on the left side of the = operator.
§ 2room=234 ✗ Illegal variable name
Operators
§ Arithmetic Operators (+, -, *, /, %, **)
  § Order: ** > *, /, % > +, -; use () to change order, or always use ()!
  § Integer division vs float division

a=5/2 # a=2 in Python 2 (both operands are integers)
b=5.0/2 # b=2.5
§ Relational Operators (>, >=, <, <=, ==, !=)
  § Used for variable comparison, e.g. numbers and strings
  § == (equality) vs = (assignment)

if a==b:
    print "a equals b"

§ Logical Operators (and, or, not)
  § Order: not > and > or; use ()
  § Arithmetic > Relational > Logical; use ()
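The precedence rules above can be checked with a few expressions. The sketch assumes Python 3, where / always performs float division and // performs integer (floor) division, which differs from the Python 2 behaviour of 5/2 shown earlier. The variable names are illustrative.

```python
int_div = 5 // 2      # integer (floor) division
true_div = 5 / 2      # float division in Python 3
remainder = 5 % 2     # modulo
power = 2 ** 3        # exponentiation

# ** binds tighter than *, which binds tighter than +
no_parens = 2 + 3 * 4 ** 2          # 2 + (3 * (4 ** 2))
with_parens = ((2 + 3) * 4) ** 2    # brackets change the order

# and binds tighter than or
logic_default = True or False and False     # True or (False and False)
logic_parens = (True or False) and False

# arithmetic first, then comparison, then logic
mixed = 1 + 2 > 2 and not 3 < 2             # ((1 + 2) > 2) and (not (3 < 2))

print(int_div, true_div, remainder, power)
print(no_parens, with_parens)
print(logic_default, logic_parens, mixed)
```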
Program structures
Control structures
Sequential

Example: Celsius to Fahrenheit converter

C = input("Enter a Celsius value: ")
F = 9.0/5*C+32
print C,"Celsius =",F,"Fahrenheit"
Conditional

Example: grade calculator I

mark = input("Enter your mark: ")

if mark<50:
    print "Fail"
else:
    print "Pass"
Example: grade calculator II
mark = input("Enter your mark: ")
if mark<50:
    print "Fail"
else:
    if mark<65:
        print "Pass"
    else:
        if mark<75:
            print "Credit"
        else:
            if mark<85:
                print "D"
            else:
                print "HD"
Example: grade calculator II (using elif statement)

mark = input("Enter your mark: ")
if mark<50:
    print "Fail"
elif mark<65:
    print "Pass"
elif mark<75:
    print "Credit"
elif mark<85:
    print "D"
else:
    print "HD"
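The elif chain can also be written as a function that returns the grade instead of printing it, which makes the boundary values easy to check. This is a sketch for Python 3; the function name grade is hypothetical, and the thresholds are the ones from the example above.

```python
def grade(mark):
    """Return the grade label for a numeric mark."""
    if mark < 50:
        return "Fail"
    elif mark < 65:
        return "Pass"
    elif mark < 75:
        return "Credit"
    elif mark < 85:
        return "D"
    else:
        return "HD"

# Check the marks on either side of each boundary
for mark in [49, 50, 64, 65, 74, 75, 84, 85]:
    print(mark, grade(mark))
```

Only the first condition that is true is executed, so each elif can rely on all earlier conditions having failed.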
More about conditional structure
§ More complex conditions can be described by using logical operators (and, or, not)

if age>65 and income<10000:
    print "qualify for pensioner discount"

§ Correct indentation is important in if and all other Python code blocks!

✗ body not indented
if mark<50:
print "Fail"

✓ consistent indentation
if mark<50:
    print "Fail"
else:
    print "Pass"

✗ else not aligned with if
if mark<50:
    print "Fail"
  else:
    print "Pass"

Don’t miss out the ‘:’ at the end of the if statement

Iterative (loops)

Example: input validation

# named s rather than str, so the built-in str() stays usable
s = raw_input("Enter an 8-character string")
# len() returns the size of string, list, tuple, ...
while len(s)!=8:
    print "Input error"
    s = raw_input("Enter an 8-character string")
Example: range-based loop
Loops can be used to go through all items in a list
# print all items in the list
xlist = [1, 3, 5, 7, 9]
for x in xlist:
    print x,
print

## 1 3 5 7 9

# sum over all items in the list
xlist = [1, 3, 5, 7, 9]
total = 0   # named total so the built-in sum() is not hidden
for x in xlist:
    total = total + x
print "sum =", total

## sum = 25
How about searching for a number in the list?
Or finding the maximum/minimum value?
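One possible answer to the two questions above is sketched below with plain loops, in the same style as the sum example. The function names are hypothetical and the code is written for Python 3. Python also provides built-ins for these tasks (the in operator, list.index(), max() and min()), so the loops are shown only to practise iteration.

```python
def find_index(items, target):
    """Return the position of target in items, or -1 if it is absent."""
    position = 0
    for item in items:
        if item == target:
            return position
        position = position + 1
    return -1

def find_max(items):
    """Return the largest item of a non-empty list."""
    largest = items[0]
    for item in items[1:]:
        if item > largest:
            largest = item
    return largest

def find_min(items):
    """Return the smallest item of a non-empty list."""
    smallest = items[0]
    for item in items[1:]:
        if item < smallest:
            smallest = item
    return smallest

xlist = [1, 3, 5, 7, 9]
print(find_index(xlist, 7), find_index(xlist, 4))
print(find_max(xlist), find_min(xlist))
```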
Specifying range

range(stop)
range(start, stop[, step])

§ return a list of numbers (integers only!)
§ start: begin value (inclusive) of the list, 0 if omitted
§ stop: end value (exclusive) of the list
§ step (optional): interval between two values, 1 by default
§ similar to the Matlab ':' operator (start:step:stop), except that stop is excluded in Python

Examples:
# x=[0,1,2,...,9]
x = range(10)
x = range(0,10)
# x=[1,3,5,7,9]
x = range(1,10,2)
# x=[10,9,...1]
x = range(10, 0, -1)
# how about [10,9,...,0]?
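One way to answer the last question is to move the stop value one step past 0, since stop is exclusive. The sketch below assumes Python 3, where range() produces a lazy sequence rather than a list, so list() is used to show the values. The variable names are illustrative.

```python
# stop is exclusive, so counting down to 0 needs stop = -1
countdown = list(range(10, -1, -1))
print(countdown)

# the other examples, converted to lists for display
first_ten = list(range(10))
odds = list(range(1, 10, 2))
down_to_one = list(range(10, 0, -1))
print(first_ten, odds, down_to_one)
```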
Get help

1. Use help() function: help(cumsum)
2. Use ?cmd or cmd?: ?sum (type in q to quit help page)
3. Use cmd without brackets: range (with very limited text)
4. Use internet: most comprehensive way of getting help
Strings
String operations

# Basic manipulations
# create a string
s = "Hello world"
# Length of string
len(s) # = 11
# Type conversion
# string to number
s = "123.0"
s = float(s) # s = 123.0
s = "123"
s = int(s) # s = 123
# number to string
s = str(123) # s = "123"

# Type checks
s.isalpha() # letters
s.isdigit() # digits
s.isalnum() # letters + digits
s.isspace() # spaces
s.isupper() # upper-case
s.islower() # lower-case

# Case conversion
s = "hEllo"
s = s.upper() # "HELLO"
s = s.lower() # "hello"
s = s.capitalize() # "Hello"
Functions
Functions

§ A function is a stored procedure that performs some task
§ How to use functions?
  § Function definition: define the function
  § Function call: use the defined function elsewhere
§ Two types of functions in Python
  § Built-in (system-defined) functions
    § print(), input(), raw_input(), float(), int(), len(), ...
    § Must treat built-in function names as reserved words
  § User-defined functions
Elements in functions
Define and use a function
Function definition

def add(a, b):
    c = a + b
    return c

Use a function

a = input("Enter the first number: ")
b = input("Enter the second number: ")
c = add(a, b) # Function call
print "sum =", c

§ keyword def indicates start of function definition
§ indentation is used to indicate content of function
§ return statement returns value to caller
§ multiple values can be returned as a tuple, e.g. return (a,b)
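Returning several values as a tuple, as mentioned in the last point, might look like the sketch below. The function name min_max and the sample numbers are hypothetical, and the code is written for Python 3.

```python
def min_max(values):
    """Return the smallest and the largest value as a tuple."""
    smallest = min(values)
    largest = max(values)
    return (smallest, largest)

# The caller can unpack the returned tuple into separate variables
lo, hi = min_max([3, 9, 1, 7])
print("min =", lo, "max =", hi)

# or keep the result as a single tuple
result = min_max([3, 9, 1, 7])
print(result)
```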
Why use functions?

# x = [0,1,2,3,4,5,6,7,8,9]
xlist = range(10)
# the following code prints the sum of xlist
total = 0
for x in xlist:
    total = total + x
print "sum =", total
# x = [1,3,5,7,9]
xlist = range(1, 10, 2)
# the following code prints the sum of xlist
total = 0
for x in xlist:
    total = total + x
print "sum =", total

## sum = 45
## sum = 25
Code reuse
# define the sum function
def sumFunc(xlist):
    total = 0
    for x in xlist:
        total = total + x
    print "sum =", total

# x = [0,1,2,3,4,5,6,7,8,9]
xlist = range(10)
# the following code prints the sum of xlist
sumFunc(xlist)
# x = [1,3,5,7,9]
xlist = range(1, 10, 2)
# the following code prints the sum of xlist
sumFunc(xlist)

## sum = 45
## sum = 25
Summary

§ A function can be defined once and used everywhere throughout the program
  § Enhance code reusability and maintainability
  § Avoid changing one place without updating other parts
  § Improve readability and create better structured programs
§ Function design considerations
  § What does the function do?
  § What input does the function take? (input arguments)
  § What result should the function return?
    § No return value: e.g. print the result inside function
    § Return value required: e.g. return the calculation result to the caller of the function
More on data types
List and tuples

List
- A list is a collection of values
food = ["chicken", "beef", "egg", "milk"]
- '[' and ']' are used to define the list
- Items are separated by ','s
- A list item can be any object, even another list

Lists behave like arrays in C++ and Java and follow similar indexing rules.
List operations
Create a list

x = ["hello", "world"] # list of two words
x = [] # an empty list
x = range(5) # x=[0,1,2,3,4]
# x = [0,1,4,9,16]
x = [i**2 for i in range(5)]

Search a list

x = ["hello", "world"]
# return the position of "world" in the list
pos = x.index("world") # pos = 1
# raises a ValueError if item not found
pos = x.index("work") # pos undefined
Modify a list
Initial: x = [0,1,2] (each example below starts again from this list)

1. add elements to the end

x = x + [3] # x = [0,1,2,3]
x.append(3) # x = [0,1,2,3]
x = x + [3,4,5] # x = [0,1,2,3,4,5]
x.extend([3,4,5]) # x = [0,1,2,3,4,5]

2. insert element in the middle

x.insert(1,5) # x = [0,5,1,2]

3. add elements in the front

x.insert(0,3) # x = [3,0,1,2]
x = [3] + x # x = [3,0,1,2]
x = [3,4] + x # x = [3,4,0,1,2]
Iteration and List Comprehension
List items can be iterated in a loop

for x in range(5):
    print x

## 0
## 1
## 2
## 3
## 4

This is the same as

for x in [0,1,2,3,4]:
    print x

## 0
## 1
## 2
## 3
## 4
Return the list index in a loop

Z = ["Hello", "world", "Python"]
for i,x in enumerate(Z):
    print i, x

## 0 Hello
## 1 world
## 2 Python
List comprehension offers easy and natural ways to construct lists

squares = [x**2 for x in range(5)]
print squares
evens = [x for x in range(10) if x % 2==0]
print evens

## [0, 1, 4, 9, 16]
## [0, 2, 4, 6, 8]
Tuples

§ Tuples are sequences that behave like lists
§ Unlike lists, tuples are immutable and can’t be changed
§ Tuples are defined by ‘(’ and ‘)’, lists use ‘[’ and ‘]’
  x = ('Jack', 'Smith', 'Lecturer', 'B', 1)
  Note that items can have different data types
§ Retrieve items of a tuple
  § x[0], x[1], ...: first, second, ... items of a tuple
  § Tuple Expansion:
    (fName, lName, title, level, step) = x
    equivalent to 5 separate assignments so that fName = 'Jack', lName = 'Smith', and so on.
Mutable vs Immutable Types
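The difference between a mutable list and an immutable tuple can be seen in a short sketch. The variable names and values are illustrative, and the code is written for Python 3.

```python
# Lists are mutable: a change made through one name is visible through the other
names = ["Hello", "World"]
alias = names             # alias refers to the same list object
alias.append("Python")
print(names)              # the original list has changed too

# Tuples are immutable: item assignment raises a TypeError
person = ("Jack", "Smith", "Lecturer", "B", 1)
try:
    person[0] = "John"
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)

# To "change" a tuple, build a new one from parts of the old one
promoted = person[:4] + (2,)
print(person, promoted)
```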
Dictionary

A collection is a bunch of values in a single variable

§ List, Tuple: collection of single values in order

x = ["Hello", "World", "Python"] # List

§ Dictionary: unordered collection of key-value pairs

Keys must be unique; string keys are case sensitive


Create a dictionary

x = {"Hello":1, "World":2, "Python":2}
# the following defines the same dictionary
x = dict()
x["Hello"] = 1
x["World"] = 2
x["Python"] = 2

Modify a dictionary

# update the value for an existing key
x["Hello"] = 2
# add a new key-value pair
x["Programming"] = 5
# delete a key-value pair
del x["World"]
Dictionary example: counting words

# define a list of words
wordList = ["hello", "hello", "world", "python", "PYTHON", "Hello"]
wordDict = dict() # create an empty dictionary
for word in wordList:
    word = word.lower() # convert to lower case
    # add a new word to dictionary
    if word not in wordDict:
        wordDict[word] = 1
    else: # increment count for old word
        wordDict[word] = wordDict[word] + 1
print wordDict

## {'python': 2, 'world': 1, 'hello': 3}
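The same counting logic can be written more compactly with the dictionary method get(), which returns a default value when the key is missing. The sketch below wraps it in a function for reuse; the name count_words is hypothetical and the code is written for Python 3.

```python
def count_words(words):
    """Count case-insensitive occurrences of each word."""
    counts = {}
    for word in words:
        key = word.lower()
        # get() returns 0 for a word that has not been seen yet
        counts[key] = counts.get(key, 0) + 1
    return counts

wordList = ["hello", "hello", "world", "python", "PYTHON", "Hello"]
wordDict = count_words(wordList)
print(wordDict)
```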


Files
Open files
Files can be opened with the open() function

fid = open("mytext.txt")

open() returns a file identifier (stored in fid), a handle for further file operations.
It raises an error if the file does not exist.
File reading

# read all lines into the lines variable (as one string)
lines = fid.read()
# read the next line into line variable
line = fid.readline()

An open file must be closed by fid.close()

File read example

fid = open("mytext.txt")
# print each line of mytext.txt in a loop
for line in fid:
    print line
fid.close()

## First line
##
## Second line
##
## Three lines in total

Unpleasant extra line break

fid = open("mytext.txt")
# print each line of mytext.txt in a loop
for line in fid:
    print line.strip()
fid.close()

## First line
## Second line
## Three lines in total


Write to files
Opening files for writing

fid = open("mydata.txt", "w")

§ creates a new file if mydata.txt does not exist
§ overwrites the old file if mydata.txt already exists
§ use "a" instead of "w" to append to mydata.txt instead of overwriting

Write to files

fid.write(line)

§ need to pay attention to "newlines"
  § print() prints a new line automatically, write() does not
  § may have to use fid.write(line + "\n") in most cases
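Writing and reading back a file can be combined into one sketch. It uses the with statement, which closes the file automatically at the end of the block, as an alternative to calling fid.close() by hand. The temporary directory and the sample lines are assumptions made so the example does not touch existing files, and the code is written for Python 3.

```python
import os
import tempfile

lines = ["First line", "Second line", "Three lines in total"]

# Write into a fresh temporary directory so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), "mydata.txt")

# "w" creates the file; write() does not add a newline, so add it explicitly
with open(path, "w") as fid:
    for line in lines:
        fid.write(line + "\n")

# Read the file back; strip() removes the trailing newline of each line
with open(path) as fid:
    read_back = [line.strip() for line in fid]

print(read_back)
```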
