[go: up one dir, main page]

0% found this document useful (0 votes)
26 views71 pages

Lecture 7 - Part 2

Uploaded by

김채원
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views71 pages

Lecture 7 - Part 2

Uploaded by

김채원
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 71

Data Analytics

for Civil & Environmental Engineers

Lecture 7
Introduction to MATLAB for Data Analysis
(Start of Part 2)

1
Part 2

Your research may require the followings

 Define a scientific problem;


 Collect scientific data;
 Apply in-depth data analysis methods;
 Synthesize the analysis results;
 Provide error or uncertainty estimates;
 Present the research results in written and oral forms.

Note: Main lecture contents of Part 2 - from Prof. Wang, Y. 2


3
4
5
Data analysis: Opinions vs. Facts
 “Everyone is entitled to his own opinion but
not his own facts” -Daniel Patrick Moynihan

 Global warming is related to the rise of Dow Jones


Industrial Averages.

 Genetically modified potatoes have 10 times more


Hg than the “normal” potatoes, and therefore unsafe
for human consumption.

 In the past 60 years, the price of potatoes never fell


in all 50 states at the same time. Buying potatoes
across the country is a safe investment.
6
Scientific method

 Formulate hypotheses;

 Observe events and gather data about natural


phenomena;

 Test hypotheses;

Observations usually play a pivotal role in formulating


and testing hypotheses.

7
Observation platforms

 Fixed in situ instruments

 e.g., ground sites (wind speed, pollution


concentrations), stationary buoy

 Mobile in situ instruments


 e.g., cars, ships, aircrafts, balloons, drifting buoy

 Remote sensing measurements

 fixed; mobile; satellites

8
Sampling

 Space

 location; coverage area

 Time

 frequency; period

 Uncertainties

 Precision; accuracy

 Concurrency

 multi-platform/multi-variable
9
Data types

 0-D measurements (1 site)


 Time series measurements
 1-D measurements
 vertical profiles
 multiple species of pollution at 1 site
 2-D measurements
 surface observation network
 3-D measurements
 2-D network with profiles
 satellites
10
Data analysis methods w/ Matlab (this course)

 Univariate methods
 Empirical distributions
 Theoretical distributions (Guassian)
 Statistical tests (e.g., t, F, and χ2)
 Bivariate methods
 Linear correlation
 Regression/prediction, and analysis of residuals
 Bootstrap and Jackknife estimates of regression uncertainties
 Reduced major axis/curvilinear regression
 Principal component regression (EOF and PC analysis)
 Nonlinear regression and weighted regression
11
Data analysis methods w/ Matlab (this course)

 Time-series analysis
 Auto covariance
 Auto spectral/cross spectral analysis
 Uneven-spaced data processing
 Nonlinear time-series analysis
 Signal processing
 “Irregular” temporal variation – signals
 Multivariate statistics
 Principal component analysis

12
Data analysis

 Confirmatory data analysis


 Confirm or falsify existing hypotheses
 Exploratory data analysis
 Suggest hypotheses about the causes of observed
phenomena
 Assess assumptions on which statistical inference will be
based
 Support the selection of appropriate statistical tools and
techniques
 Provide a basis for further data collection through surveys
or experiments
 AI/machine learning
13
Introduction to MATLAB

 Syntax
 Flow control (loops, if)
 Function
 Plots
 Plot, plot3
 Subplot
 Surf, surfc, contour
 Data handling

14
MATLAB interface
 Command window
display results; run scripts; run statements
command prompt >>
 Directory window (set path; Matlab drive)
Open files
 Workspace
 List of variables
 Double click a matrix to open “array editor”
 Editor
 Edit Matlab script
 Run script
 Help (doc command)
 Path (Set path)
15
MATLAB online version

matlab.mathworks.com; drive.matlab.com

16
Programming languages

17
Syntax: assignment

 Scalar
x = 10
x = 9^0.5
x = sqrt(9)

 Special characters

18
Syntax: Vector assignment

 1-D vector; use [ ] for assignment


 Row vector
 Column vector

 Sequence

 Evenly spaced sequence

 Assigning elements

 Operators
u+v, u-v, u.*v, u*v', u.^0.5
19
Syntax: Vector assignment

 Row vector A=[1 2 3]


 Column vector B=A'; B=[1; 2; 3]
 Sequence A=1:10; A=1:0.5:10
 Evenly spaced sequence A= linspace(1,10,9)
 Assigning elements A(2)=10; A(2:2:6)=9:11
 Remove an element A(5)=[]
 u+v, u v, u*v': matrix operations
 . Operator: applies to the elements
u.*v, u.^0.5
20
Syntax: Matrix assignment

 2-D array; table; use [ ] for assignment

 Assign by row: A=[ 1 2 3; 3 4 5; 6 7 8]

 Assign by column

B = [ [1 2 3]' [2 4 7]' [3 5 8]']

 Matrix operator: A*B

 Element operator: A.*B

21
Matrix initialization

 Initialize a m x n matrix of zeros

x=zeros(m,n);

 Initialize a m x n matrix of ones

x=ones(m,n);

 Initialize a m x n matrix with 1's on the diagonal


and 0's elsewhere

x=eye(m,n);

22
Strings: an array of characters

A = [73 32 97 109 32 108 101 97 114 110 105 110 103]


B = [32 77 65 84 76 65 66]
str1=char(A)
str2=char(B)
str=strcat(str1,str2);
disp(str)

23
Useful commands
 who
lists in alphabetical order all variables in the currently active
workspace
 whos
“who” list with along with their sizes and types
 what
list all MATLAB-related files in current directory
 clear
removes all variables from the workspace
 clc
clears all input and output from the Command Window
 clf
clear the figure window
 close all
close all figures
 diary filename
Save the commands you issued and the results you obtain can be
saved to a disk file specified by filename (end by type ‘diary off’) 24
For loops

for i = array (i.e., first : increment : last)


commands
end

Sum up an array
A=1:100;
sm=0;
for i=1:100
sm=sm+A(i);
end
sm
25
While loops

While condition
commands
end

Sum up an array
A=1:100;
sm=0;
i=1;
while i <= max(size (A))
%size (A) returns [1, 100]
%for a vector, one can use length(A)
sm=sm+A(i);
i=i+1; %very important
end
sm

To interrupt, Ctrl-C
26
Script

>> edit first_script_average


Save, run

A=1:100;
sm=0;
for i=1:100
sm=sm+A(i);
end
average=sm/length(A)

27
Golden Section (golden ratio)

 Since antiquity, many philosophers, artists, and


mathematicians have been intrigued by the golden
section, which Renaissance writers called the divine
proportion.
 A golden section is created by the point C on line
segment AB if AC:AB = CB:AC.

28
29
Golden Section (golden ratio)

Write a matlab script to compute the divine


proportion without using sqrt function

30
x(1)=1+1/2
for i=2:20
x(i)=1+1/x(i-1)
end
gsection=(sqrt(5)+1)/2
plot(x(1:10)-gsection)
xlabel('Iteration')
ylabel('Error')

31
32
How do we convert this script by using “while” statement?
What is the advantage?

x(1)=1+1/2
for i=2:20
x(i)=1+1/x(i-1)
end
gsection=(sqrt(5)+1)/2
plot(x(1:10)-gsection)
xlabel('Iteration')
ylabel('Error')

33
While loop
x(1)=1+1/2;
gsection=(sqrt(5)+1)/2;
i=1;
while abs(gsection-x(i))>= 1e-9
x(i+1)=1+1/x(i);
i=i+1;
end

plot(x(1:10)-gsection)
xlabel('Iteration')
ylabel('Error')

What do we do if we don’t know the answer?


34
Iterative calculation
x(1)=1+1/2;
gsection=(sqrt(5)+1)/2;
x(2)=1+1/x(1);
i=2;
%while abs(gsection-x(i))>= 1e-9
while abs(x(i)-x(i-1))>= 1e-9
x(i+1)=1+1/x(i);
i=i+1;
End

plot(x(1:10)-gsection)
xlabel('Iteration')
ylabel('Error')

35
IF selection

36
Score count

score=fix(100*rand(50,1));
pass=0; fail=0;
for i =1:50
if score(i) >=60
pass=pass+1;
else
fail=fail+1;
end
end
How can we simplify the calculation
pass, fail
without using loops?

37
Vector operation

Logic comparison:
true returns 1; false returns 0

score=fix(100*rand(50,1));
pass = (score >=60);
sum(pass)

%find what the scores are.


index=find(score >=60);
score(index)

38
Relation operators

Less than <


Less than or equal to <=
Greater >
Greater or equal to >=
Equal to ==
Not equal to ~=

39
Logic operators

And &

Or |
Not ~

Next question:
How do we find the scores between 60 and 80?

40
Find the scores between 60 and 80

score=fix(100*rand(50,1));
index=find (score >=60 & score <=80);
score(index)
length(index) %length(x)=max(size(x))
disp(strcat('student # =',...
num2str(length(index))))

How do we write “find” statement with “~”


index=find (~(score < 60 | score >80))

41
Estimating π

 What is the probability of hitting the blue target of


a circle with a radius of 1 within a 2x2 square?

42
Estimating π

n=10^6;
x= 2*rand (n, 1)-1;
y= 2*rand (n, 1)-1;
r= sqrt(x.^2+y.^2);
xpi=sum(r<=1)/n*4

43
Function

 Format
function [out1, out2, ...] = funname(input1, input2, ...)
statements
end
 1-Variable
function b=sort(a)
 2-Variable
function [b,j]=sort(a)
 Must use in a script file using the function name
for the file name
44
Function: Descending sort

 Sort a vector, a , in ascending mode


[b,j ]=sort(a)
b: sorted vector
j: indices of vector a that gives b(:) == a(j)

a=round(100*rand(1,5));
[b, j]=sort(a);
b, a(j)

 Write a function that sort vector a in descending


order using function, sort (do not use sort (a,
‘descend’))
45
Function: Descending sort

46
Inline function
advantage: no need to write a script file

 Useful for math expression

f = inline('sqrt(x.^2+y.^2)','x','y')

function a=mysqrt(x,y)
a=sqrt(x.^2+y.^2)
end

47
Rounding to integers

 ceil
 Round towards plus infinity
 floor
 Round towards minus infinity
 fix
 Round towards zero
 round
 Round towards nearest integer

48
Plot

plot(X,Y,S): plots vector


Y versus vector X

S is a character string
made from one element
from any or all the
following 3 columns
[‘color symbol line_style’]

49
Plot

x = [0:0.1:10];
y = sin(x).*x./(1+cos(x)) ;
plot(x,y,'r-')
plot(x,y, 'b-',x,y,'go')
plot(x,y, 'b-',x,y,'go',x,exp(x+1),'m--')
semilogy(x,abs(y), 'b-',x,abs(y),'go',...
x,exp(x+1),'m--')

50
Plot

 Subplot: divides the current figure into


rectangular panes that are numbered row wise.

subplot(m, n, p)
m: rows
n: columns
p: position 1:m*n;
count by rows

51
Subplot

x = [0:0.1:10];
y = sin(x).*x./(1+cos(x));
subplot(2,2,1), plot(x,y,'r-')
subplot(2,2,2), plot(x,y, 'b-',x,y,'go')
subplot(2,2,3), plot(x,y, 'b-',x,y,'go',x,exp(x+1),'m--')
subplot(2,2,4), semilogy(x,abs(y), 'b-',x,abs(y),'go',...
x,exp(x+1),'m--')

52
Subplot

53
Plot

axis([xmin xmax ymin ymax])


or use xlim([xmin,xmax]); ylim([ymin, ymax])
title('string')
xlabel('string')
ylabel('string')
text(x,y,'string')

54
Plot: Two circles

55
Plot

seq=0:pi/100:2*pi;
r1=4, r2=2;
x=r1*cos(seq);
y=r1*sin(seq);
x2=4-r2*cos(seq); Hold on: retains the
current plot and certain
y2=4-r2*sin(seq);
axes properties so that
fill(x,y,'b') subsequent graphing
hold on commands add to the
existing graph.
plot(x2,y2,'r')

56
sprintf: write formatted data to string
s=sprintf(format, variables)
'%g': generic number format
‘FontSize’ to change font

axis([-r1,4+r2,-r1,4+r2],'square')
xlabel('x'),ylabel('y')
title(sprintf('Graphs of x^2+y^2=%g^2 and (4-x)^2+(4-
y)^2=%g^2',r1,r2), 'FontSize',14)
%title('Graphs of x^2+y^2=4^2 and (4-x)^2+(4-y)^2=2^2')
text(-3,5, 'Two Circles')
hold off

57
3D-plot: plot3

Function
x=tcos(t); y=tsin(t); z=t
t in the range of -10 π to 10 π

t = -10*pi:pi/100:10*pi;
x = t.*cos(t);
y = t.*sin(t);
plot3(x,y,t);
title('Curve u(t) = [t*cos(t), t*sin(t), t ]')
grid
58
Loma Prieta Earthquake

 MATLAB default data earthquake.mat contains


200 Hz acceleration data from the October 17,
1989 Loma Prieta earthquake in the Santa
Cruz Mountains. (What is 200 Hz?)

 Get the file


load quake e n v
e: E-W; n: N-S; v: vertical

 Data were scaled with g


e=g*e
59
Loma Prieta Earthquake

 Data processing

load quake e n v
g = 0.0980; %km/s^2
e = g*e; n = g*n; v = g*v;

 The data were sampled at 200 Hz. How do we


set the time sequence for e, n, v?

60
Loma Prieta Earthquake
%time
delt= 1/200;
t = delt*(1:length(e));

%plot
xrange= [0,50];
yrange= [-250 250];
limits = [xrange yrange];
subplot(3,1,1), plot(t,e,'b'), axis(limits), title('East-West acceleration')
subplot(3,1,2), plot(t,n,'g'), axis(limits), title('North-South acceleration')
subplot(3,1,3), plot(t,v,'r'), axis(limits), title('Vertical acceleration')

How do we plot just between 8 and 25?


61
Loma Prieta Earthquake

%can also use xlim(ylim, zlim) function


%syntax: xlim([xmin xmax])
subplot(3,1,1), xlim([8 25])
subplot(3,1,2), xlim([8 25])
subplot(3,1,3), xlim([8 25])
%subplot(3,1,3), plot(t,v,'r'), xlim([8 25]), title('Vertical acceleration')

% Alternative
%trange=[8, 25];
%subplot(3,1,1), plot(t,e,'b'), xlim(trange), title('East-West acceleration’)
%subplot(3,1,2), plot(t,n,'g'), xlim(trange), title('North-South acceleration’)
%subplot(3,1,3), plot(t,v,'r'), xlim(trange), title('Vertical acceleration’)

62
surf: 3-D shaded surface plot

 surf(X,Y,Z)
X, Y, Z are 2-D arrays

 shading flat (constant in each mesh grid)


 shading interp (smooth-looking)

63
View angle

 view(az,el) sets the viewing angle for a three-


dimensional plot. The azimuth, az, is the
horizontal rotation about the z-axis as measured
in degrees from the negative y-axis.

64
Meshgrid

 Function
z=y2-x2

 Create X, Y arrays (meshgrid)


[X,Y] = meshgrid(x,y) transforms the domain
specified by vectors x and y into arrays X and Y

65
Surfc/surf
x = -1:0.05:1;
y = x;
[xi,yi] = meshgrid(x,y);
zi = yi.^2 -xi.^2; %notice . operator
surfc(xi,yi,zi) %with a contour map beneath
%surf(xi,yi,zi)
shading interp
view(120,30)

grid off
view (30, 30)
Or click “rotation” tool in figure menu
66
2-D surface

x = -1:0.05:1;
y = x;
[xi,yi] = meshgrid(x,y);
zi = yi.^2 -xi.^2;
surf(xi, yi, zi) %surf(x,y, zi) also works
shading interp
view(0,90)
colorbar

shading flat

67
Contour

 Contour (x, y, z) [c,h] = contourf(xi, yi, zi,':');


%dotted contour line
 Contourf (x, y, z) to fill
shading flat
color
clabel(c,h)
 Label
title('The level curves of z =
[c,h]=contour(x, y, z) y^2 -x^2.')
xlabel('x')
clabel(c,h)
ylabel('y')
 Add color bar
colorbar colorbar

68
Data handling

u=1:5; v=2:4;
save test.mat u v
clear
load test.mat

save test.txt u v -ascii


clear
load test.txt (Doesn‟t work. Why?)

69
Data handling

SampleID Percent C Percent S


101 0.3657 0.0636
102 0.2208 0.1135
103 0.5353 0.5191
104 0.5009 0.5216
105 0.5415 -999
106 0.501 -999

70
Data handling

a=load('geochem.txt');

%Remove rows with negative numbers

[r, c]=find(a<0)

a(r,:)=[]

71

You might also like