ML Module-1 Notes

The document outlines various concepts and modules related to machine learning, including data understanding, decision trees, artificial neural networks, and reinforcement learning. It discusses the importance of data quality, types of data, and the processes involved in machine learning such as training, validation, and model evaluation. Additionally, it highlights applications of machine learning in areas like speech recognition, product recommendation, and anomaly detection.

Uploaded by

shahidhhkhan58
Copyright
© All Rights Reserved

Machine Learning

Credits: 4
Sub Code: BCS602

Textbook: Machine Learning by S. Sridhar and M. Vijayalakshmi
Reference: Tom Mitchell, Machine Learning

Module-1: Chapter 1, 2 (2.1-2.5)
- Introduction to Machine Learning
- Understanding data

Module-2: Chapter 2 (2.6-2.8, 2.10), 3 (3.1-3.3, 3.4, 3.6)
- Understanding data
- Basics of learning theory

Module-3: Chapter 4 (4.2-4.5), 5 (5.1-5.3, 5.5-5.7), 6 (6.1, 6.2)
- Similarity-based learning
- Decision tree learning

Module-4: Chapter 8 (8.1-8.4), 10 (10.1-10.5, 10.9-10.11)
- Artificial Neural Networks

Module-5: Chapter 13 (13.1-13.6), 14 (14.1-14.10)
- Reinforcement learning
Module-1

Chapter 1: Basics of Machine Learning


Definition of learning

Herbert Simon:
Learning denotes changes in the system that enable a system to do the same task more efficiently the next time.

Tom Mitchell (1997), standard definition:
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Why machine learning?

- It plays a role in improving and understanding the efficiency of human learning.

|.02 2025
Iris Dataset

"A Datast shcul ot be himal


because it wl Besult ein
cuffosertiteod weh iopropviatt
the hop f foatures
" Foorn datast e wl ectact features to oDK om the
prdicios
wl egult n
mairg (abols

- ML is a field of study that gives computers the ability to learn without being explicitly programmed. It uses training algorithms that enable them to make predictions, to classify objects, or to take some decisions.

How does ML work?

bat sot macheine Rosut


-fcec Leakns ralyre

- A machine learning system learns from historical data, builds the prediction models, and whenever it receives new data, predicts the output for it.
- Flow chart of the ML process:

Input (past data) -> Training the machine learning algorithm -> Building models -> Output
*'Refer to blcs
nth
- Testing dataset: data set used to test the model.
- Training dataset: data used to train the model/algorithm.
- Validation set: used for verifying with all the predictions available.

Performance measures:
- F1 score
- Accuracy
- Precision

Features of ML

- ML uses data to detect various patterns in a given dataset.
- It can learn from past data and improve automatically.
- It is a data-driven technology.
- Machine learning is much similar to data mining, as it also deals with a huge amount of data.
Applications of ML:
- Speech recognition
- Product recommendation
- Automobiles
- Email spam and malware filtering
- Virtual personal assistant
- Online fraud detection
- Stock market trading
- Medical analysis
- Automatic language translation
2.02

Types of learning:
- Supervised learning
- Unsupervised learning: clustering, association analysis, dimension reduction
- Reinforcement learning
- Semi-supervised learning
- Data can normally be of the following kinds:
  - Supervised
  - Unsupervised

- Data can be labeled data or unlabeled data.


Con bo di ffolentiatod mayzedfsom the ctatn
od idertfied.wÈh omdthen
the data atseady
prOert.
name

Supervised learning
- Uses labeled data for the learning process.
- This type of learning makes use of a supervisor.

Oogarcn.
cocentkates on
(abols uch al
OClasificativn rnairy ra
Cas (tigt astdble).
mot, Finding dabti
ciabatic
potierts.
The algorithms for classification:
1) Decision tree
2) Random forest
3) Support Vector Machine (SVM)
4) Naive Bayes algorithm
5) ANN (Artificial Neural Network)
Regression
- Regression is a process which predicts continuous variables (numbers).
- It can be used for predicting, e.g. finding the cost of products.
- Consider the following example, which uses historical data.

a (0eeK alatay
The model is of the form y = mx + c (also written with coefficients a and b), where a and b are called regression coefficients; they are learned from the data.
- y is called the dependent variable.
- x is called the independent variable.
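A minimal sketch of how the two coefficients of y = mx + c could be learned from historical data, assuming ordinary least squares as the fitting method (the notes do not say which method is used). The function name `fit_line` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Hypothetical historical data: x is the independent variable,
# y is the dependent variable.
weeks = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

m, c = fit_line(weeks, sales)
predicted = m * 6 + c  # prediction for an unseen value x = 6
```

Once m and c are learned, predicting for new data is just evaluating the line at the new x.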
Unsupervised learning
- It doesn't use labeled data.
- Clustering groups the objects into disjoint clusters based on their attributes.
- The algorithm: K-means.
Dimension reduction: by reducing features.
- Dimension reduction takes higher-dimensional data as input and gives data in a lower dimension as output.
- It is the task of reducing the data set to a few features without losing the generality of the data.

Association Analysis

- Tries to map something to something else.
(uo02.2S

Supovicog.
"R hay a Sgoutcod
clata
-R doosDt ue laboleol
"B usg ele tabelcd data
Labels or categories

Semi-supervised learning
- When a dataset has a huge collection of unlabeled data and some labeled data, then we use semi-supervised learning.
- Labeling is a very costly process.
- Semi-supervised learning algorithms use unlabeled data by assigning pseudo-labels.
- Then the labeled and unlabeled (pseudo-labeled) data sets can be combined.
the pokson doùg a OORK (Anfoicemet)
the roOasd.
cloes tie Bhey com t

- Reinforcement learning is one in which a learning algorithm is trained not on preset data but rather based on a feedback system.
- The agents can be a human, animal, robot, or any independent program.
- The rewards enable the agent to gain experience.
- The agent aims to maximize the reward.
- The reward can be positive or negative.
Challenges of Machine Learning
- Problems: problems that are ill-posed.
- Huge data (data sets).
- Complexity of algorithms.
- Bias-variance.
Machine learning process:
1. Understand the business.
2. Understand the data.
3. Data preprocessing.
4. Modeling.
5. Model evaluation.
6. Model deployment.
Chapter 2: Understanding Data

Introduction to Data: Big Data

Data:
- Data are facts in the form of audio, video, image, numbers, and text.
- Data are needed to make decisions.

Characteristics of data:
- Volume: measured in terabytes (TB), petabytes (PB), etc.
- Velocity: the speed of data, its increase in data volume, and the fast arrival of data.
- Variety: includes the form of data, the function done on data, and the source of the data.
- Veracity of data deals with its truthfulness.
- Validity of data is the accuracy of the data.
- Value: the data should have some value.
17.02"2s

Data Sources

- A data source can be:
1) Structured data
2) Semi-structured data
3) Unstructured data

Structured data:
1) Record data
2) Data matrix
3) Graph data
4) Ordered data: temporal data, sequence data, spatial data

Record data:
- Data is in a tabular form, where rows will be entities and columns will be attributes.
Data Matrix:
- A data matrix is similar to record data, except that the data stored is in numeric attributes.
- All matrix operations can be applied.

Graph Data

- Graph data brings out the relationships among objects.


the data con be

ohen pojetel
data.
Considesd

- These are of 3 types:
  - Sequence data has some sequence. Sequence data does not have a timestamp.
  - Spatial data: data that refers to positions or areas.
  - Temporal data: data with a time element.
Unstructured data:
- Unstructured data are, for example, audio and text data.
- About 80% of organizational data is unstructured.
Semi-structured data:
- Semi-structured data can be JSON, XML objects, RSS feeds, and hierarchical records.
- RSS feeds provide info for real-time updates on our favourite sites like websites, blogs, news, etc.

1) Flat files: the simplest and cheapest way of organizing data, in which the data is stored in plain ASCII format.
Some of the popular spreadsheet formats are:
- CSV: comma-separated values.
- TSV: tab-separated values.

2) Database:
- Transactional database, e.g. banking, booking, etc.
- Temporal database

3) Others:
- WWW (World Wide Web)
- XML (eXtensible Markup Language)
- Data stream: a dynamic flow of data (incoming and outgoing of data).
- RSS (Really Simple Syndication)

1) Descriptive analytics: describing the main features of the data.
2) Diagnostic analytics: aims to find the cause and effect of events.
3) Predictive analytics: deals with the future.
4) Prescriptive analytics: finds the best course of action for the business organization.

dota
PAome0ORk.
Data-fndys
FAomeoOKS!
a data
Doto onnectn Rootiny.
th datatSronsice
omoted
ta msltie laye
Phosertabion yr.
14-02.25

Data Collection

Good data characteristics:
- Timeliness: the data should be relevant and not stale or obsolete.
:Tho data shoua Lo oolovant md

abgut he cata
addd
made onto datacot.

O Data ibhatios
dta.
- Healthcare systems, like patient insurance data.
- Social media data: Twitter, Facebook, YouTube videos, Instagram.
- Multimodal data.

Data that cause problems:
- Inconsistent data
- Inaccurate data
- Missing data values
Handling missing values:
1) Ignore the tuple.
2) Fill in the values manually: this is time-consuming for lots of data.
3) Use a global constant: "Unknown" or "Infinity" can be used to fill in the missing values.
4) The attribute value may be filled in with the attribute mean, e.g. the average income.
5) Use the attribute mean for all the samples belonging to one class.
6) Use the most possible value to fill in the missing value. It can be obtained from other methods such as classification and decision tree prediction.

Noisy data:
- Noisy data can be handled using a technique called binning.

Binning technique
S = {12, 14, 19, 22, 24, 26, 28, 31, 34}
Step 1: Take the complete (sorted) data set.
Step 2: Divide it into equal bins.
Bin 1: 12, 14, 19
Bin 2: 22, 24, 26
Bin 3: 28, 31, 34

Smoothing by bin means:
In this method the values in each bin are replaced by the bin mean.
Bin 1: 15, 15, 15    since (12 + 14 + 19)/3 = 15
Bin 2: 24, 24, 24    since (22 + 24 + 26)/3 = 24
Bin 3: 31, 31, 31    since (28 + 31 + 34)/3 = 31

Smoothing by bin boundaries:
Each value is replaced by the nearest boundary value of its bin.
Bin 1: 12, 12, 19
Bin 2: 22, 22, 26 (or) 22, 26, 26
Bin 3: 28, 28, 34 (or) 28, 34, 34
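The binning steps above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the tie-breaking rule (a value equally far from both boundaries goes to the lower one) is an assumption, since the notes allow either choice.

```python
def equal_frequency_bins(values, bin_size):
    """Sort the data and split it into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value by the closer of the bin's two boundary values.

    Assumption: ties are resolved towards the lower boundary.
    """
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


data = [12, 14, 19, 22, 24, 26, 28, 31, 34]
bins = equal_frequency_bins(data, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)
```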
Data integration and transformation

Normalization procedures used are:
1) Min-max procedure: transforming data to the range 0-1.
2) z-score.

(ag size

a n e ef a t e Snall
Ompatat1o

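1) Min-max procedure

Formula: min-max = ((V - min) / (max - min)) × (new max - new min) + new min

Consider V = 88, 90, 92, 94, scaling it down to the range 0-1 by this transformation.
min = 88, max = 94, new min = 0, new max = 1

For 88:
min-max = ((88 - 88) / (94 - 88)) × (1 - 0) + 0 = 0

For 94:
min-max = ((94 - 88) / (94 - 88)) × (1 - 0) + 0 = 1

For 90:
min-max = ((90 - 88) / (94 - 88)) × (1 - 0) + 0 = 0.33

For 92:
min-max = ((92 - 88) / (94 - 88)) × (1 - 0) + 0 = 0.66 (0.67 when rounded)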
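2) z-score method of normalization

z = (V - μ) / σ
μ = mean
σ = standard deviation

Consider marks like V = {10, 20, 30}. Convert the marks to z-score values. By z-score transformation the result is {-1, 0, 1}.

Mean:
μ = (10 + 20 + 30)/3 = 20

Standard deviation:
σ = sqrt(((10 - 20)^2 + (20 - 20)^2 + (30 - 20)^2) / 2) = sqrt(200/2) = 10

Formula: z = (V - μ)/σ
For 10: (10 - 20)/10 = -1
For 20: (20 - 20)/10 = 0
For 30: (30 - 20)/10 = 1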
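The z-score example can be checked with Python's standard `statistics` module. This sketch assumes the sample standard deviation (division by n - 1), which is what gives σ = 10 for the marks 10, 20, 30; the function name `z_scores` is hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return (v - mean) / standard deviation for every value."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, divides by n - 1
    return [(v - mu) / sigma for v in values]


marks = [10, 20, 30]
z = z_scores(marks)  # close to [-1, 0, 1]
```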

Descriptive Statistics

- It does data summarization.
- Visualization techniques help to understand the nature of data.
" Dota rnalytisCd dta

Gmalks

Types of data:
- Numerical data: interval data, ratio data
- Categorical data: nominal data, ordinal data

Categorical data

Nominal data:
- Nominal data are symbols which cannot be processed like a number.
- They provide information but they have no ordering or distance.
- E.g. patient ID, blood test.


.

90 Medun
gatia
3. positie

.IN
1,
nominal oacta bhdinal
doesn't data.

Ordinal data:
- Ordinal data provides enough information and has a natural order.


Note:
U

anew value
VT

Numerical data:

Interval data:
- Interval data is numeric data for which the differences between values are meaningful.
- E.g. temperature is an interval data: the difference between two temperatures is meaningful.
- Operations that can be applied to interval data are + and -.
Roio data

hedifolerno bjuo retiò titoal cata e poution9e o en he


Saje.
- Take and È cONVOS on, th XOOS De bot Sale doeot
utch Hene ttese aY2 cal o sti data ,

Note: Data variables

Univariate, bivariate, multivariate.

Pased on matse

.IN
Alumeliccta

Data visualization

Bar chart:
- Used to display the frequency distribution for variables.
- Used to illustrate discrete data.

Pie chart:
- It illustrates univariate data.

Dot plot:
- Dot plots are similar to bar charts.
- Dot plots are less clustered as compared to bar charts, as they illustrate the bars only with single points.

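Central Tendency
- It is the summary of the data and helps in comparison.
- Ways in which data can be summarized: mean, median, mode.

1) Mean of data
- There are three means computed for the data; one of them is the geometric mean.
- The geometric mean of n items is the n-th root of their product: (x1 × x2 × ... × xn)^(1/n).

2) Median of data
- The median represents the middle value in the distribution.
- For grouped data: Median = L + ((N/2 - cf) / f) × i, where L is the lower limit of the median class, N the number of items, cf the cumulative frequency of the class before the median class, f the frequency of the median class, and i the class interval.

3) Mode of data
- Mode is the value that occurs most frequently in the dataset.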
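The three summaries can be computed with Python's standard `statistics` module, as in the sketch below. The sample values are hypothetical and were chosen only so that the results are easy to check by hand; `geometric_mean` assumes Python 3.8 or later.

```python
from statistics import mean, median, mode, geometric_mean

# Hypothetical sample, not taken from the notes.
values = [2, 4, 4, 8]

arithmetic = mean(values)           # (2 + 4 + 4 + 8) / 4
geometric = geometric_mean(values)  # 4th root of 2 * 4 * 4 * 8
middle = median(values)             # average of the two middle values
most_frequent = mode(values)        # the value that occurs most often
```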
"Thesla&e the value tht has

Dispersion
- Dispersion is the spread of a set of data around the central tendency.

Range:
- Range is the difference between the maximum and minimum of the given list.

Standard deviation:
- Standard deviation is the average distance from the mean of the dataset to each point.
- E.g. for 10, 20, 30: mean = (10 + 20 + 30)/3 = 20.
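Quartiles and inter-quartile range
- It is sometimes convenient to subdivide the dataset using coordinates.
- Percentiles are about data that are less than the coordinates by some percentage of the total value.
- The 25th percentile can be denoted as Q1 (also written Q0.25), the first quartile.
- Formula: IQR = Q3 - Q1
- Outliers are normally the values falling apart at least by the amount 1.5 × IQR above the third quartile or below the first quartile.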

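Q. The patients' ages are given by 12, 14, 19, 22, 24, 26, 28, 31, 34. Find the IQR.

Solution
Step 1: Arrange the data in order; the median is the fifth value, 24.
Step 2: Identify the first and third quartiles.
Q1 (Q0.25): lower half = {12, 14, 19, 22}. Its median is (14 + 19)/2 = 16.5.
Q3 (Q0.75): upper half = {26, 28, 31, 34}. Its median is (28 + 31)/2 = 29.5.
Step 3: Formula to calculate IQR:
IQR = Q3 - Q1 = 29.5 - 16.5
IQR = 13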
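The solution above can be reproduced with a short sketch. It assumes the median-of-halves convention used in the worked example (the overall median is left out of both halves when the number of values is odd); other quartile conventions exist and can give slightly different numbers. The function name `quartiles` is hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


ages = [12, 14, 19, 22, 24, 26, 28, 31, 34]
q1, q2, q3 = quartiles(ages)
iqr = q3 - q1
```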
HLO (alculato ei-TR.
Formula:

<minimum, Q1, median, Q3, maximum>

.IN
- (2,3,4,7,",9
step2. alculato he onaion
mdi-8.
-Calculate . C
N
SY

=3.

Seepu -Calaulate
U
VT

Seps Fnd min max.


muium 2

. fis-point Surraasyi: <2,, , 0,)3> BDXPlOt


|4

M.
Skewness
- Two things define the shape of data: skewness and kurtosis.
- Skewness is the measure of direction and degree of symmetry.


Fofrowa to Uate Skeuwhess o`:
Fodra

Kurtosis
- It also indicates the peaks of data. If the data has a high peak, it indicates higher kurtosis, and vice versa.

FoAmwa

Special univariate plots:
- Stem and leaf plot
- Q-Q plot (quantile)

makS E studertt
S, co,80, as
usthy stemlag pbt
Starm eat ene tem th one Vaue

H a n i d
Q-Q plot (Quantile-Quantile plot)
cisiibt
dintitted
-ronaldibecton

data oh 2datasots.

olralzo
