Machine Learning
CA Qdits y
Sub Code: BCS602
Textbook:
- Machine Learning by S. Sridhar and M. Vijayalakshmi
Reference: Tom Mitchell, Machine Learning
Module-1: Chapter 1, 2 (2.1-2.5)
- Introduction to Machine Learning
- Understanding data
Module-2: Chapter 2 (2.6-2.8, 2.10), 3 (3.3, 3.4, 3.6)
- Understanding data
- Basic learning theory
Module-3: Chapter 4 (4.2-4.5), 5 (5.1-5.3, 5.5-5.7), 6 (6.1, 6.2)
- Similarity-based learning
- Decision tree learning
Module-4: Chapter 8 (8.1-8.4), 10 (10.1-10.5, 10.9-10.11)
- Artificial Neural Networks
Module-5: Chapter 13 (13.1-13.6), 14 (14.1-14.10)
- Reinforcement learning
0.02.25 Module-1
Chapter 1: Basics of Machine Learning
Definition of learning
Herbert Simon:
- Learning denotes changes in a system that enable the system to do the same task more efficiently the next time.
Mitchell (1997) → standard definition:
- A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Why machine learning?
- It plays a role in improving and understanding the efficiency of human learning.
|.02 2025
Iris Dataset
- A dataset should not be minimal, because that will result in insufficient data and an inappropriate number of features.
- From the dataset we will extract features to work on the predictions.
wl egult n
mairg (abols
- ML is a field of study that gives computers the ability to learn without being explicitly programmed. Training algorithms enable them to make predictions, to classify objects, or to make correct decisions.
How does ML work?
Data set → Machine → Result
- A machine learning system learns from historical data, builds the prediction models, and whenever it receives new data, predicts the output for it.
- Flow chart of the ML process:
Input (past data) → Training (machine learning algorithm) → Building (models) → Output
*'Refer to blcs
nth
- Testing dataset → data set used to test the model.
- Training dataset → data used to train the model/algorithm.
- Validation set → used for verifying the model with all the predictions available.
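The three kinds of dataset above can be sketched as one shuffled split. This is only an illustration: the 60/20/20 proportions, the function name `split_dataset`, and the ten placeholder records are assumptions, not something stated in the notes.

```python
import random

def split_dataset(rows, train_pct=60, val_pct=20, seed=0):
    """Shuffle rows and cut them into training, validation and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left is the testing dataset
    return train, validation, test

rows = list(range(10))  # hypothetical stand-in for ten data records
train, validation, test = split_dataset(rows)
print(len(train), len(validation), len(test))  # 6 2 2
```

Every record lands in exactly one of the three parts, so the model is never tested on data it was trained on.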
Performance measures
→ F1 score, mean accuracy, precision (for predictive models)
Features of ML
- ML uses data to detect various patterns in a given dataset.
- It can learn from past data and improve automatically.
- It is a data-driven technology.
- Machine learning is very similar to data mining, as it also deals with huge amounts of data.
Applications of ML:
- Speech recognition
- Product recommendation
- Automobiles
- Email spam and malware
- Virtual personal assistant
- Online fraud detection
- Stock market trading
- Medical analysis
- Automatic language translation
2.02
Types of learning: Supervised learning | Unsupervised learning | Reinforcement learning | Semi-supervised learning
- Clustering
- Association analysis
- Dimension reduction
- Learning from data is normally supervised or unsupervised.
- Data can be labeled data or unlabeled data.
- Labeled data can be differentiated, analyzed, and identified with the names already present in the data.
Supervised learning
- Supervised learning uses labeled data for the learning process.
- This type of learning concentrates on labels, such as the class (target variable).
- Classification: mainly finding patterns that predict the class.
The algorithms for classification:
- Decision tree
- Random forest
- Support Vector Machine (SVM)
- Naive Bayes algorithm
- ANN (Artificial Neural Network)
Regression
- Regression is a process which predicts continuous variables (numbers).
- It can be used for prediction, e.g. finding the cost of products.
- Regression uses historical data.
- The model is of the form y = mx + c (or) y = ax + b, where a and b are called regression coefficients. It learns them from the data.
- y is called the dependent variable and x is called the independent variable.
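To make the model y = ax + b concrete, here is a minimal sketch that learns a and b from data. The notes do not say how the coefficients are fitted; ordinary least squares is assumed here, and the week/sales numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for y = a*x + b using ordinary least squares (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx              # slope (regression coefficient a)
    b = mean_y - a * mean_x    # intercept (regression coefficient b)
    return a, b

weeks = [1, 2, 3, 4, 5]    # independent variable x (hypothetical historical data)
sales = [3, 5, 7, 9, 11]   # dependent variable y, generated from y = 2x + 1
a, b = fit_line(weeks, sales)
prediction = a * 6 + b     # predict y for the unseen week 6
print(a, b, prediction)    # 2.0 1.0 13.0
```

Because the made-up data lie exactly on a line, the fit recovers the coefficients exactly; real historical data would leave some error.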
Unsupervised learning
- It doesn't have (and doesn't use) labeled data.
- Clustering: it groups the objects into disjoint clusters based on their attributes.
- The algorithm: K-means
Dimension reduction (by reducing features)
- Dimension reduction takes higher-dimensional data as input and gives data with fewer dimensions as output.
- It is the task of reducing the data set to a few features without losing the generality of the data.
Association analysis
- Trying to map something to something else.
(uo02.2S
Semi-supervised learning
- Supervised learning uses labeled data (labels or categories); unsupervised learning doesn't use labeled data.
- When a dataset has a huge collection of unlabeled data and some labeled data, then we use semi-supervised learning.
- Labeling is a very costly process.
- The unlabeled data is labeled by assigning pseudo-labels.
- Then the labeled and unlabeled data sets can be combined.
Reinforcement learning
- It is like a person doing a work for the reward.
- It is a type of learning in which a learning algorithm is trained not on preset data but rather based on a feedback system.
- The agents can be human, animal, robot, or any independent program.
- The rewards enable the agent to gain experience.
- The aim is to maximize the reward.
- The reward can be positive or negative.
Challenges of Machine Learning
- Problems: ill-posed problems
- Huge data (data set)
- Complexity of algorithms
- Bias-variance
Machine learning process
1. Understand the business
2. Understand the data
3. Data preprocessing
4. Modeling
5. Model evaluation
6. Model deployment
Chap 2: Understanding Data
Introduction to Data, Big Data
Data
- Data are facts.
- It can be in the form of audio, video, image, numbers, or text.
- Data is needed to make decisions.
Big Data: characteristics of data
- Volume: the amount of data, measured in terabytes (TB), petabytes (PB), etc.
- Velocity: the fast arrival of data, i.e. the speed of increase in data.
- Variety: includes the form of data, the function done on data, and the source of data.
- Veracity of data deals with its truthfulness.
- Validity of data is the accuracy of the data.
- Value: the data should have some value.
17.02.25
Data Sources
- A data source can be:
  1. Structured data
  2. Semi-structured data
  3. Unstructured data
Structured data
- Record data
- Data matrix
- Graph data
- Ordered data: temporal data, sequence data, spatial data
Record data
- It is in a tabular form whose rows are entities and whose columns are attributes.
Data matrix
- A data matrix is similar to record data, except that the data stored has only numeric attributes.
- All matrix operations can be applied.
Graph data
- Graph data brings out the relationship among objects.
the data con be
ohen pojetel
data.
Considesd
- These are of 3 types:
- Sequence data has some sequence. Sequence data does not have a time stamp.
- Spatial data: data that refers to positions or areas.
- Temporal data: data associated with time.
Unstructured data
- Unstructured data are audio data, text data, etc.
- It makes up most of organizational data.
Semi-structured data
- Semi-structured data can be JSON, XML objects, RSS feeds, and hierarchical records.
- They carry information for real-time updates on our favourite sites, like websites, blogs, news, etc.
1) Flat files
- A cheaper way of organizing data.
- The data is stored in plain ASCII format.
- Some of the popular spreadsheet formats are:
  - CSV: comma-separated values
  - TSV: tab-separated values
2) Database
- Transactional database, e.g. banking, airport booking, etc.
- Temporal database
3) Others
- WWW (World Wide Web)
- XML (eXtensible Markup Language)
- Data stream: a dynamic flow of data, with data incoming and outgoing.
- RSS (Really Simple Syndication)
- JSON (JavaScript Object Notation)
Types of data analytics
- Descriptive analytics: describing the main features of the data.
- Diagnostic analytics: aims to find the cause and effect of events.
- Predictive analytics: predicts the future.
- Prescriptive analytics: finds the best course of action for the business.
Data analysis framework
- Data connection layer
- Data management layer
- Data analytics layer
- Presentation layer
14.02.25
Data Collection
Good data characteristics
- Timeliness: the data should be relevant and not stale or obsolete.
- Relevancy: the data should be relevant.
- Knowledge about the data.
Data sources
- Healthcare systems, like patient insurance data
- Social media data: Twitter, Facebook, YouTube videos, Instagram
- Multimodal data
nuraic.
Data that cause problems:
- Incomplete data
- Inaccurate data
- Missing data (missing values)
Ways to handle missing values:
- Ignore the tuple.
- Fill in the values manually: this is time consuming, especially for lots of data.
- Use a constant: "Unknown" or "Infinity" can be used, among others.
- Use the attribute mean: the attribute value is filled with the average of that attribute, e.g. the average income.
- Use the attribute mean for all the samples belonging to the same class.
- Use the most possible value to fill in the missing value. It can be obtained using methods like decision tree prediction.
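Three of the strategies listed above (constant, attribute mean, class-wise attribute mean) can be sketched as follows. The function names, the use of `None` for a missing value, and the income and class data are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def fill_with_constant(values, constant="Unknown"):
    """Replace every missing value (None) with a fixed constant."""
    return [constant if v is None else v for v in values]

def fill_with_mean(values):
    """Replace every missing value with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def fill_with_class_mean(values, labels):
    """Replace a missing value with the mean of the known values in its class."""
    sums, counts = {}, {}
    for v, c in zip(values, labels):
        if v is not None:
            sums[c] = sums.get(c, 0) + v
            counts[c] = counts.get(c, 0) + 1
    return [sums[c] / counts[c] if v is None else v
            for v, c in zip(values, labels)]

incomes = [30, None, 50, 70, None, 90]      # hypothetical attribute values
classes = ["A", "A", "A", "B", "B", "B"]    # hypothetical class labels

print(fill_with_constant(incomes))             # [30, 'Unknown', 50, 70, 'Unknown', 90]
print(fill_with_mean(incomes))                 # [30, 60.0, 50, 70, 60.0, 90]
print(fill_with_class_mean(incomes, classes))  # [30, 40.0, 50, 70, 80.0, 90]
```

The class-wise mean gives different fill values for classes A and B, which is the point of using it over the single overall mean.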
Noisy data:
- Noisy data can be handled using a technique called binning.
Binning technique
Data: 12, 14, 19, 22, 24, 26, 28, 31, 34
S1: Take the complete data set.
S2: Divide it into equal bins.
Bin 1: 12, 14, 19    Bin 2: 22, 24, 26    Bin 3: 28, 31, 34
Smoothing by bin means:
- In this method the values in each bin are replaced by the bin mean.
Bin 1: (12 + 14 + 19)/3 = 15 → 15, 15, 15
Bin 2: (22 + 24 + 26)/3 = 24 → 24, 24, 24
Bin 3: (28 + 31 + 34)/3 = 31 → 31, 31, 31
Smoothing by bin boundaries:
- Each value is replaced by the closest boundary value of its bin.
Bin 1: 12, 12, 19
Bin 2: 22, 22, 26 (or) 22, 26, 26
Bin 3: 28, 28, 34 (or) 28, 34, 34
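The binning example above can be reproduced with a short sketch. The function names are illustrative, the bins are assumed to divide the data evenly, and a value exactly midway between the two boundaries is sent to the lower one (the notes allow either choice).

```python
def equal_frequency_bins(values, n_bins):
    """Sort the data and cut it into n_bins bins of equal size.
    Assumes len(values) is divisible by n_bins."""
    ordered = sorted(values)
    size = len(ordered) // n_bins
    return [ordered[i * size:(i + 1) * size] for i in range(n_bins)]

def smooth_by_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace every value by the closer bin boundary (ties go to the lower one)."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed

data = [12, 14, 19, 22, 24, 26, 28, 31, 34]
bins = equal_frequency_bins(data, 3)
print(bins)                        # [[12, 14, 19], [22, 24, 26], [28, 31, 34]]
print(smooth_by_means(bins))       # [[15.0, 15.0, 15.0], [24.0, 24.0, 24.0], [31.0, 31.0, 31.0]]
print(smooth_by_boundaries(bins))  # [[12, 12, 19], [22, 22, 26], [28, 28, 34]]
```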
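Data integration and transformation
Normalization procedures used are:
1. Min-max procedure: maps the data to a given range.
2. z-score
Min-max procedure
Formula: $\text{min-max} = \frac{V - \min}{\max - \min} \times (\text{new max} - \text{new min}) + \text{new min}$
V = 88, 90, 92, 94; scale it to the range 0-1 using the min-max transformation.
min = 88, max = 94, new min = 0, new max = 1
For 88: $\frac{88 - 88}{94 - 88} \times (1 - 0) + 0 = 0$
For 90: $\frac{90 - 88}{94 - 88} \times (1 - 0) + 0 \approx 0.33$
For 92: $\frac{92 - 88}{94 - 88} \times (1 - 0) + 0 \approx 0.67$
For 94: $\frac{94 - 88}{94 - 88} \times (1 - 0) + 0 = 1$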
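The min-max calculation can be checked with a few lines of code. This is a minimal sketch: the function name and its default range of 0 to 1 are choices made for the illustration.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Map values linearly so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

marks = [88, 90, 92, 94]
scaled = min_max_scale(marks)
print([round(s, 2) for s in scaled])  # [0.0, 0.33, 0.67, 1.0]
```

The function divides by (max - min), so it would fail on data where every value is the same; a real implementation would need to handle that case.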
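z-score method of normalization
Formula: $z = \frac{V - \mu}{\sigma}$, where $\mu$ = mean and $\sigma$ = standard deviation.
Consider marks V = {10, 20, 30}. Convert the marks to z-score values using the z-score transformation.
$\mu = \frac{10 + 20 + 30}{3} = 20$
Standard deviation: $\sigma = \sqrt{\frac{(10 - 20)^2 + (20 - 20)^2 + (30 - 20)^2}{3 - 1}} = \sqrt{\frac{200}{2}} = 10$
For 10: $\frac{10 - 20}{10} = -1$
For 20: $\frac{20 - 20}{10} = 0$
For 30: $\frac{30 - 20}{10} = 1$
The z-score values are {-1, 0, 1}.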
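The z-score example can be reproduced with Python's standard `statistics` module. Its `stdev` function uses the n - 1 divisor, which is what yields a standard deviation of 10 for these marks; the function name `z_scores` is just an illustrative choice.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert values to z-scores using the sample standard deviation (n - 1 divisor)."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]

marks = [10, 20, 30]
print(mean(marks), stdev(marks))  # 20 10.0
print(z_scores(marks))            # [-1.0, 0.0, 1.0]
```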
Descriptive Statistics
- It does data summarization.
- Visualization techniques help to understand the nature of the data.
Data types
- Numerical data: interval data, ratio data
- Categorical data: nominal data, ordinal data
Categorical data
Nominal data:
- Nominal data are symbols which cannot be processed like a number.
- They provide information, but they have no ordering or distance.
- e.g. patient ID, blood test.
.
90 Medun
gatia
3. positie
1,
- Nominal data doesn't have an order; ordinal data should.
Ordinal data:
- Ordinal data provides enough information and has a natural order.
- Such data is called ordinal data.
Note:
anew value
Numerical data:
Interval data
- Interval data is numeric data for which the differences between values are meaningful.
- e.g. temperature: the difference between two temperature values is meaningful.
- Operations that can be applied to interval data are + and -.
Ratio data
- For ratio data, both the difference and the ratio between values are meaningful.
- Take the °C and °F conversion: the zeros of the two scales do not match. Hence these are called interval data.
Note: data can also be classified
- Based on variables: univariate, bivariate, multivariate.
- Based on nature: numeric data.
Data visualization
Bar chart:
- Used to display the frequency distribution for variables.
- Used to illustrate discrete data.
Pie chart:
- It illustrates univariate data.
Dot plots:
- Dot plots are similar to bar charts.
- Dot plots are less clustered as compared to bar charts, as they illustrate the bars only with single points.
Central tendency
- It is the summary of the data, and it helps in comparison.
- Ways in which data can be summarized: mean, median, mode.
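1) Mean of data
- There are three types of mean computed for the data; one of them is the geometric mean.
- Geometric mean $= \sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$
2) Median of data
- The median represents the middle value in the distribution.
- For grouped data: Median $= L_1 + \frac{\frac{N}{2} - cf}{f} \times i$, where $L_1$ = lower limit of the median class, N = number of items, cf = cumulative frequency, f = frequency, and i = class interval.
3) Mode of data
- The mode is the value that occurs most frequently in the dataset.
- It is the value that has the highest frequency.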
Dispersion
- Dispersion is the spread of a set of data around the central tendency.
- Measures: range, standard deviation, skew and kurtosis.
Range:
- It is the difference between the maximum and minimum of the given list.
Standard deviation:
- Standard deviation is the average distance from the mean of the dataset to each point.
- e.g. 10, 20, 30: $\bar{x} = \frac{10 + 20 + 30}{3} = 20$
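The central tendency and dispersion measures described above are available in Python's standard `statistics` module, so a small sketch can compute them directly. The list `values` is made up so that the geometric mean, median and mode come out as round numbers; `marks` is the 10, 20, 30 example from the notes, and the sample (n - 1) standard deviation is assumed.

```python
from statistics import mean, geometric_mean, median, mode, stdev

values = [2, 4, 4, 8]   # hypothetical data chosen for round results
marks = [10, 20, 30]    # example data from the notes

# Central tendency
print(mean(values))                      # 4.5
print(round(geometric_mean(values), 6))  # 4.0, the 4th root of 2*4*4*8 = 256
print(median(values))                    # 4.0
print(mode(values))                      # 4

# Dispersion
data_range = max(marks) - min(marks)     # range = maximum - minimum
print(data_range)                        # 20
print(stdev(marks))                      # 10.0 (sample standard deviation)
```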
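- It is sometimes convenient to subdivide the dataset using coordinates.
- Percentiles are about data that are less than the coordinates by some percentage of the total value.
- The 25th percentile can be denoted as Q1 (the first quartile) and the 75th percentile as Q3 (the third quartile).
- Formula: IQR = Q3 - Q1
- Outliers are normally the values falling apart at least by 1.5 × IQR above the third quartile or below the first quartile.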
Q) Patient ages are given by 12, 14, 19, 22, 24, 26, 28, 31, 34. Find the IQR.
Solution
Step 1: Find the first and third quartiles.
  Lower half = {12, 14, 19, 22}, upper half = {26, 28, 31, 34}
  Q1 = median of the lower half = (14 + 19)/2 = 16.5
  Q3 = median of the upper half = (28 + 31)/2 = 29.5
Step 2: Formula to calculate the IQR:
  IQR = Q3 - Q1 = 29.5 - 16.5
  IQR = 13
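The quartile steps above can be sketched in code. This follows the method used in the worked example (median of the lower and upper halves, leaving out the middle value when the count is odd); other quartile conventions exist and can give slightly different numbers. The function name `quartiles` is an illustrative choice.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.
    When the number of values is odd, the middle value is left out of both halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

ages = [12, 14, 19, 22, 24, 26, 28, 31, 34]
q1, q2, q3 = quartiles(ages)
iqr = q3 - q1
print(q1, q2, q3)  # 16.5 24 29.5
print(iqr)         # 13.0

# The same values give the five-point summary (minimum, Q1, median, Q3, maximum).
print((min(ages), q1, q2, q3, max(ages)))  # (12, 16.5, 24, 29.5, 34)
```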
Five-point summary
Formula: (minimum, Q1, median, Q3, maximum)
Step 1: Arrange the data: 2, 3, 4, 8, 9, 11, 13
Step 2: Calculate the median: median = 8
Step 3: Calculate Q1 = 3
Step 4: Calculate Q3 = 11
Step 5: Find min and max: minimum = 2, maximum = 13
Five-point summary: (2, 3, 8, 11, 13) → box plot
Skewness and kurtosis
- Two things define the shape of data: skewness and kurtosis.
Skewness
- Skewness is the measure of the direction and degree of symmetry.
- Formula to calculate skewness:
Kurtosis
- It also indicates the peaks of data. If the data has a high peak, it indicates higher kurtosis, and vice versa.
- Formula:
Special univariate plots
- Stem and leaf plot
- Q-Q plot (quantile-quantile plot)
Stem and leaf plot
- e.g. marks of students, shown using a stem and leaf plot.
Q-Q plot (quantile-quantile plot)
- It is used to check whether data is normally distributed, or to compare the data of 2 datasets.