Statistics in Modern Science & Technology
Buddhananda Banerjee
Department of Mathematics
Centre for Excellence in AI
IIT Kharagpur
1
Socratic questioning
Socratic questioning is named after Socrates, a philosopher who lived in Greece (c. 470 BCE to 399 BCE).
Socrates used an educational method that focused on discovering answers by asking questions of his students.
2
Is the world / the behaviour of Nature
QUALITATIVE
or
QUANTITATIVE?
Can we manipulate it up to a certain extent?
If yes, then how?
4
Mathematics
Understanding patterns in nature
Use of logic to evolve it with some
assumptions/ postulates /axioms
Statistics
Assumption of uncertainty / randomness
Considering repetitive behaviour /
law of large numbers (LLN)
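The law of large numbers can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the fair-die setup, the function name `running_mean_of_die`, and the fixed seed are assumptions made for illustration only.

```python
import random

EXPECTED_VALUE = 3.5  # (1 + 2 + ... + 6) / 6 for a fair die


def running_mean_of_die(n_rolls, seed=0):
    """Return the sample mean of n_rolls simulated fair six-sided die rolls.

    The seed is fixed (an illustrative choice) so that runs are repeatable.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)  # randint includes both endpoints
    return total / n_rolls


if __name__ == "__main__":
    # The gap between the sample mean and 3.5 tends to shrink as n grows.
    for n in (10, 1_000, 100_000):
        mean = running_mean_of_die(n)
        gap = abs(mean - EXPECTED_VALUE)
        print(f"n = {n:>7}: sample mean = {mean:.4f}, gap = {gap:.4f}")
```

Each individual roll is unpredictable, yet the average over many repetitions settles near a fixed value, which is the kind of pattern statistics exploits.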
6
Does anything happen at
random?
NO : Under Newtonian physics
YES : Under Quantum physics
Then what appears to us as random?
It is the limitation of our measurement capacity that makes something appear to us to happen at random
8
How does STATISTICS help here?
Statistics tries to find out patterns
even in random phenomena
Statistics admits the presence of
error and gives mathematically
justified methods to control it
Statistical analysis helps to estimate values forward in time (prediction) or backward in time (past estimation)
11
Statistics over time
1654 – Pascal and Fermat : mathematical theory of probability
1657 – Huygens : first book on mathematical probability
1693 – Halley : first mortality tables
1713 – Jacob Bernoulli (posthumous) : law of large numbers
1761 – Thomas Bayes : Bayes’ theorem
1786 – Playfair : graphs and bar charts of data
1801 – Gauss : predicts the orbit of Ceres using a line of best fit
1805 – Legendre : method of least squares for fitting a curve
1814 – Laplace : generating functions
1866 – Venn : frequency interpretation of probability
1880 – Thiele : Brownian motion, likelihood function, cumulants
1888 – Galton : correlation
1900 – Bachelier : stock price movements as a stochastic process
1908 – Student (W. S. Gosset) : t-distribution for the mean of small samples
1928 – Tippett and Fisher : extreme value theory
1933 – A. N. Kolmogorov : axiomatic probability
1935 – R. A. Fisher : Design of Experiments
1937 – Neyman and Pearson : confidence interval and statistical testing
1946 – R. T. Cox : probability from simple logical assumptions
1948 – Shannon : entropy
1953 – Nicholas Metropolis : thermodynamic simulated annealing
1979 – Bradley Efron : Bootstrap
14
Indians in Statistics
P. C. Mahalanobis (1893-1972): D^2 distance
A. K. Bhattacharyya (1915-1996): Bhattacharyya coefficient
C. R. Rao (1920-): Cramér-Rao lower bound, Rao-Blackwell theorem
D. Basu (1924-2001): Basu’s theorem
K. R. Parthasarathy (1936-): quantum stochastic calculus
S. R. Varadhan (1940-): large deviations
15
Statistics in Production
GOAL: Best possible production method under constraints
Design of experiments
Complete block design
Incomplete block design, etc.
16
Statistical Quality Control
GOAL: Maintaining product quality in a quantitative way.
Mean-chart
SD-chart
Defect-chart, etc.
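A mean chart can be sketched in a few lines. The version below is a simplified illustration under stated assumptions, not the procedure from the slides: it assumes equally sized subgroups, estimates the process standard deviation from the pooled within-subgroup variance (textbook charts often use tabulated constants instead), and uses made-up measurements. The names `mean_chart_limits` and `out_of_control` are hypothetical.

```python
import math
import statistics


def mean_chart_limits(subgroups):
    """Return (lower, centre, upper) 3-sigma control limits for subgroup means.

    Assumes every subgroup has the same size n. Sigma is estimated from the
    pooled within-subgroup sample variance (a simplifying assumption).
    """
    n = len(subgroups[0])
    means = [statistics.fmean(g) for g in subgroups]
    centre = statistics.fmean(means)
    pooled_var = statistics.fmean([statistics.variance(g) for g in subgroups])
    half_width = 3 * math.sqrt(pooled_var / n)
    return centre - half_width, centre, centre + half_width


def out_of_control(subgroups, limits):
    """Return indices of subgroups whose mean falls outside the limits."""
    lower, _, upper = limits
    return [i for i, g in enumerate(subgroups)
            if not lower <= statistics.fmean(g) <= upper]


# Illustrative (made-up) reference data from a process assumed in control.
reference = [
    [10.1, 9.9, 10.0, 10.2],
    [9.8, 10.0, 10.1, 9.9],
    [10.0, 10.2, 9.9, 10.1],
    [10.1, 10.0, 9.8, 10.0],
]
# New subgroups to monitor; the second one has drifted upwards.
new_batches = [
    [10.0, 10.1, 9.9, 10.0],
    [10.9, 11.1, 11.0, 10.8],
]

limits = mean_chart_limits(reference)
flagged = out_of_control(new_batches, limits)
```

Limits are computed once from the reference data and then applied to new subgroups, so a shift in the process shows up as a subgroup mean outside the band.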
17
Regression Analysis
GOAL: Constructing a dependency model and predicting the response at unobserved values
Linear regression
Polynomial regression
Response Surface
Logistic regression
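The simplest of the models listed above, linear regression with one predictor, can be fitted by ordinary least squares in closed form. The following is a minimal sketch; the function names `fit_line` and `predict` and the data values are illustrative assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope). Requires at least two distinct x values.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict the response at a new x value from the fitted line."""
    return intercept + slope * x_new


# Illustrative (made-up) observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
y_at_6 = predict(intercept, slope, 6)  # prediction at an unobserved x
```

The same two steps, fitting a dependency model and then evaluating it at an unobserved value, carry over to the polynomial, response-surface, and logistic variants, although those need different fitting procedures.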
18
Survival analysis & Reliability
GOAL: Optimal use of a resource that is about to fail
Life-time distributions
Cox model
Insurance policies
Censoring scheme
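To show how censoring enters a life-time analysis, here is a minimal sketch of the Kaplan-Meier product-limit estimator of the survival function. The slides do not name this estimator; it is brought in here as a standard illustration, and the function name `kaplan_meier` and the data are assumptions. A censored unit counts as "at risk" up to its censoring time but never as a failure.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier estimate of the survival function.

    times:    follow-up time of each unit
    observed: True if the failure was observed, False if right-censored
    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all units sharing the same time t.
        while i < len(data) and data[i][0] == t:
            failures += int(data[i][1])
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Failed and censored units both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Illustrative (made-up) life-times; False marks a censored observation.
times = [2, 3, 3, 5, 8, 8, 9]
observed = [True, True, False, True, False, True, True]
curve = kaplan_meier(times, observed)
```

Dropping the censored units entirely would bias the estimate downwards, which is why the censoring scheme has to be part of the model.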
19
Bio-medical Statistics
GOAL: Comparing the efficacies of drugs with minimal use of the inferior drug
Clinical trials
Adaptive design
Sequential design
Adverse drug reaction
Genetics
20
Time series analysis
GOAL: Future prediction / past estimation
Weather forecast
Stock market analysis
Business policy making
Astronomical statistics
Signal processing
21
Random Graph : Complex network
GOAL: Study the evolution of a growing community
Social media growth
Advertisement policies
22
Statistical learning
GOAL: Training a machine to do classification online / offline
Clustering / classification
Image processing
Shape analysis
Behaviour study
Recent & advanced topics
Algebraic statistics
Statistics on manifolds
Random field theory
Functional data analysis
Examples of AI-ML problems
Automated data entry
Detecting Spam
Product recommendation
Medical Diagnosis
Corrective and preventive maintenance
Speech detection
Image / video recognition (Computer Vision)
Natural Language Processing
Video / online game
etc.
Clustering / Classification / Regression (Forecasting)
How will the machine decide?
How will the machine function?
In which space and how to do the analysis?
Statistical Inference
1. Estimation
2. Testing of hypothesis
3. Interval estimation
4. Model selection … etc.
In the parametric / non-parametric / Bayesian paradigm
Mathematical Structure
1. Linear Algebra
2. Functional analysis
3. Differential geometry
4. Topology
5. Graph Theory
6. Optimisation … etc.
Computerised Automation
1. Data Storage
2. Data retrieval
3. Memory estimation
4. Signal transmission
5. Data visualisation
6. Automated service
7. Automated learning … etc.
Statistics & Information
Buddhananda Banerjee
Department of Mathematics
Centre for Excellence in Artificial Intelligence
Indian Institute of Technology Kharagpur
bbanerjee@maths.iitkgp.ac.in
19.02.2020
Outline
Introduction
Distribution
Estimation
Entropy