0% found this document useful (0 votes)

15 views52 pages

DTB Notes

Uploaded by

simranjeet kaur

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views52 pages

DTB Notes

Uploaded by

simranjeet kaur

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 52

Institute of innovation in technology and management have a clear idea about the terminologies uses, the statistical units

the statistical units employed,

the research methodology and the size of the sample. Primary data may either
be internal or external to the organization.
Study Material

Secondary Data: refers to the data that the investigator collects from another
Decision Techniques in Business management
source. Past investigators or agents collect data required for their study. The
investigator is the first researcher or statistician to collect this data. Moreover,
Faculty: Dr.Simranjeet Kaur the investigator does not have a clear idea about the intricacies of the data.
There may be ambiguity in terms of the sample size and sample technique.
There may also be unreliability with respect to the accuracy of the data.
Unit - I - Statistics

PRIMARY DATA – METHODS OF COLLECTION OF DATA

1. Definition of Statistics

(I) Direct Personal Investigation

Statistics can be defined as a set of observation of existing data and using the
information for future use.
Consists of the collection of data by the investigator in a direct manner. The
investigator (or researcher) is responsible for personally approaching a
Statistic is a basic tool which helps in analysing the given data and respondent and investigating the research and gather appropriate information.
interpreting the results for future calculations. In other words, the researcher himself enters the field and solicits data that he
requires to take the research forward. Thus, this method of data collection
Tools and techniques involved in statistical analysis are ensures first-hand information. This data is all the more reliable for intensive
research. But in extensive research, this data is inadequate and proves to be
unreliable. This method of collection of data is time-consuming. Hence, it
1. Measure of central tendency like mean, median, mode. tends to get handicapped when there is lack of time resource. However, the
2. Measure of variance greatest demerit is that this method is very subjective in nature and is not
3. Graphical tools • Bar graph • Histogram • Pie chart suitable for objective based extensive research.

2. Importance and Limitation of Statistics (II) Indirect Oral Interview

Importance Of statistics Consists of the collection of data by the investigator in an indirect manner.
The investigator (or enumerator) approaches (by telephonic interviews) an
1. Helps in research indirect respondent who possesses the appropriate information for the
2. Helps in prediction research. Thus, this method of data collection ensures first-hand information
3. Easy analysis because the interviewers can cross-question for the right and appropriate
4. Future control information.
5. Supports Decision Making
(III) Mailed Questionnaire
Limitations of statistics
Consists of mailing a set or series of questions related to the research. The
1. Calculation error in results respondent answers the questionnaire and forwards it back to the investigator
2. Duration of data after marking his/her responses. This method of collection of data has proven
3. Time consuming to be timesaving. It is also a very cost-efficient manner of collecting the
4. Not suitable for qualitative factors required data. An investigator who has the access to the internet and an email
5. Expensive account can undertake this method of data collection. The researcher can only
investigate those respondents who also have access to the internet and an
email account. This remains the only major restriction of this method.
3. Collection of data and formation of frequency distribution

(IV) Schedules
Collection of data and formation of frequency distribution

Scheduling involves a face-to-face situation with the respondents. In this

Collection of data is a statistical requirement. Statistics are a set or series of method of collecting data, the interviewer questions the respondent according
numerical data that acts as a facilitating factor of policymaking. In other to the questions mentioned in a form. This form is known as a schedule. This
words, numerical data establishes Statistics. Numerical data undergoes is different than a questionnaire. A questionnaire is personally filled by the
processing and manipulations before it aids the process of decision making. respondents and the interviewer may or may not be physically present.
Hence, numerical data are the raw materials to statistics. These raw materials Whereas the schedule is filled by the enumerator or interviewer after asking
can originate from various sources. Statisticians and analysts collect these data the respondent his/her answer to a specific question. And in scheduling
in different methods. method of collecting data, the interviewer or enumerator is physically present.

(V) Local agencies

TYPES OF DATA In this method, the information is not directly or indirectly collected by either
the interviewer of the enumerator. Instead, the interviewer hires or employs a
There are 2 types of data. Discussed below are the types of data. local agency to work for him/her and help in gathering appropriate
information. These local agents are often known as correspondents as well.
Correspondents are only responsible for gathering accurate and reliable
Primary Data: refers to the data that the investigator collects for the very first information. They work according to their preference and adopt different
time. This type of data has not been collected either by this or any other methods to do so.
investigator before. A primary data will provide the investigator with the most
reliable first-hand information about the respondents. The investigator would
1
SECONDARY DATA – SOURCES OF DATA 4. Graphic presentation of frequency distribution – graphics, Bars, Histogram,
Diagrammatic
(I) Published Sources

There are many national organizations, international agencies and official

publications that collect various statistical data. They collect data related to
business, commerce, trade, prices, economy, productions, services, industries,
currency and foreign affairs. They also collect information related to various
(internal and external) socio-economic phenomena and publish them. These
publications contain statistical reports of various kinds. Central Government
Official Publication, Publications of Research Institutions, Committee Reports
and International Publications are some published sources of secondary data. Bar Graph

(II) Unpublished Sources

Some statistical data are not always a part of publications. Such data are
stored by institutions or a private firm. Researchers often make use of these
unpublished data in order to make their research all the more original.

FREQUENCY DISTRIBUTION

Frequency distribution is a representation, either in a graphical or tabular

format, that displays the number of observations within a given interval. The Histogram
intervals must be mutually exclusive and exhaustive, and the interval size
depends on the data being analyzed and the goals of the analyst. Frequency
distributions are typically used within a statistical context. 5. Measures of central tendency – mean, median and mode

Measures of central tendency

It shows various methods of analyzing and comparing given data set.

Mean
As a statistical tool, a frequency distribution provides a visual representation
for the distribution of a particular variable. Analysts often use it to show or It is the average of the given data.
illustrate the data collected in a sample. For example, the height of children
can be split into several different categories or ranges. In measuring the height
Mean = sum of the observations/Number of observations
of 50 children, some are tall, and some are short, but there is a high
probability of a higher frequency or concentration in the middle range. The
most important factors are that the intervals used must be non-overlapping and Median
must contain all of the possible observations.
It is middle value of the data which is calculated by dividing the data in two
Visual Representation equal parts.

Frequency distributions can be presented as a frequency table, a histogram or Median of discrete series= (𝑛+1)/2(n+1)/2 the observation
a bar chart. Both histograms and bar charts provide a visual display using
columns, with the y-axis representing the frequency count, and the x-axis
representing the variable to be measured. In this example, the y-axis is the Median of Continuous series= (𝑛)/2(n)/2 th observation
number of children, and the x-axis is the height. In general, the chart will
show a normal distribution, which means that the majority of occurrences, or Mode
in this case children of a certain height, will fall in the middle column. In a
histogram, the height of the column represents the range of values for that
variable. Mode is the most occurring observation

= 𝑙𝑜𝑤𝑒𝑟𝑙𝑖𝑚𝑖𝑡+𝑖(𝐹1−𝐹0)/(2𝐹1−𝐹0−𝐹2)lowerlimit+i(F1−F0)/
Frequency Distributions Used in Trading Mode, continuous series

(2F1−F0−F2)
Frequency distributions are not commonly used in the world of investments.
However, traders who follow Richard D. Wyckoff, a pioneering trader in the
early 20th century, use an approach to trading based on frequency distribution. Mode = 3Median - 2Mean
Investment houses still use the approach, which requires considerable
practice, to teach traders. The frequency chart is referred to as a point-and- 6. partition values – quartiles, deciles and percentiles
figure chart and was created out of a need for floor traders to take note of
price action and to identify trends. The y-axis is the variable measured, and
the x-axis is the frequency count. Each change in price action is denoted in Quartiles
X’s and O’s. Traders interpret it as an uptrend when three X’s emerge; in this
case, demand has overcome supply. In the reverse situation, when the chart There are three quartiles denoted by Q1, Q2 and Q3 divides the frequency
shows three O’s, it indicates that supply has overcome demand. distribution in to four equal

2
parts Quartiles for Discrete Series (grouped data)

Step 1: Find cumulative frequencies

Step 2: Find ((N+1)/ 4)

That is 25 percent of data will lie below Q1, 50 percent of data below Q2 and
75 percent below Q3. Here Q2 is called the Median. Quartiles are obtained in Step 3: See in the cumulative frequencies, the value just greater than ((N+1)/
almost the same way as median 4) the corresponding value of x is Q1

Quartiles for Raw or Ungrouped data: Step 4: Find 3((N+1)/ 4)

If the data set consist of n items and arranged in ascending order, then Step 5: See in the cumulative frequencies, the value just greater than 3((N+1)/
4) then the corresponding value of x is Q3.
Example 5.30
Example 5.31: Compute Q1 and Q3 for the data relating to age in years
of 543 members in a village
Compute Q1 and Q3 for the
data relating to the marks of
8 students in an examination
given below 25, 48, 32, 52,
21, 64, 29, 57 Solution:

Solution:

n=8

Arrange the values in ascending order

21, 25, 29, 32, 48, 52, 57, 64 we

Quartiles for Continuous series (grouped data)

Step 1: Find cumulative frequencies

Step 2: Find (N/4)

have

3
Step 3: Q1 class is the class interval corresponding to the value of the Solution:
cumulative frequency just greater than (N/4)

Step 4 : Q3 class is the class interval corresponding to the value of the

cumulative frequency just greater than 3 (N/4)

Example 5.32: Calculate the quartiles Q1 and Q3 for wages of the labors
given below

Deciles

Deciles are similar to quartiles. Quartiles divides ungrouped data into four
quarters and Deciles divide data into 10 equal parts.

Example 5.33 Find the D6 for the following data

11, 25, 20, 15, 24, 28, 19, 21

Solution:

Arrange in an ascending order

4
11,15,19,20,21,24,25,28

Example 5.34: Calculate D5 for the frequency distribution of monthly

income of workers in a factory

Solution:

Percentiles

The percentile values divide the frequency distribution into 100 parts each
containing 1 percent of the cases. It is clear from the definition of quartiles,
deciles and percentiles

Relationship

P25 = Q1

P50 = Median = Q2

P75 = 3rd quartile = Q3

Example 5.35: The following is the monthly income (in 1000) of 8 persons
working in a factory. Find P30 income value

10,14, 36, 25, 15, 21, 29, 17

Solutions: Arrange the data in an ascending order.

n=8

5
10,14,15,17,21,25,29,36 Variation is a way to show how data is dispersed or spread out. Several
measures of variation are used in statistics.

Different Measures of Variation: -

 The Range
 Quartile
 Interquartile Range
 Variance

7.1. Range

Range

Range is defined as difference between the largest and smallest

observations in the data set.

Raw Data: Range is defined as difference between the largest and smallest
Example 5.36: Calculate P61 for the following data relating to the height observations in the data set. Range(R) = Largest value in the data set (L) –
of the plants in a garden Smallest value in the data set(S)

R= L - S

Grouped Data: For grouped frequency distribution of values in the data set,
the range is the difference between the upper-class limit of the last class
interval and the lower class limit of first class interval.

Coefficient of Range

The relative measure of range is called the coefficient of range.

Coefficient of Range =(L-S) / (L+S)

Example 6.2 The following data relates to the heights of 10 students (in
CMS) in a school.

Calculate the range and coefficient of range.

158, 164, 168, 170, 142, 160, 154, 174, 159, 146

Solution:

L=142 S=174

Range = L – S = 174 – 142 = 32

Coefficient of range = (L-S) ( L+S)

= (174-142)/(174+142) = 32/316 = 0.101

Example 6.3: Calculate the range and the co-efficient of range for the
marks obtained by 100 students in a school.

Solution:
Solution:

L = Upper limit of highest class = 75

7. Measures of variation

6
S = lower limit of lowest class = 60 standard deviation tells you the percentage of observations that fall specific
distances from the mean. However, this doesn’t work for skewed
distributions, and the IQR is a great alternative.
Range = L-S = 75-60 = 15

The interquartile range is equal to Q3 minus Q1.

Coefficient of range = (L-S) ( L+S)
For example,

=15/ (75+60) = 0.111

consider the following numbers: 1, 3, 4, 5, 5, 6, 7, 11. Q1 is the middle value
in the first half of the data set. Since there are an even number of data points
Merits: in the first half of the data set, the middle value is the average of the two
middle values; that is, Q1 = (3 + 4)/2 or Q1 = 3.5. Q3 is the middle value in
the second half of the data set. Again, since the second half of the data set has
· Range is the simplest measure of dispersion. an even number of observations, the middle value is the average of the two
middle values; that is, Q3 = (6 + 7)/2 or Q3 = 6.5. The interquartile range is
· It is well defined, and easy to compute. Q3 minus Q1, so IQR = 6.5 - 3.5 = 3.

· It is widely used in quality control, weather forecasting, stock market 7.3. Quartile Deviation
variations etc.
Quartile Deviation

Limitations:
Quartile Deviation is defined as QD = ½ (Q3 – Q1) It may also be called as
semi- inter quartile.
· The calculations of range are based on only two values – largest value and
smallest value.
where Q1 and Q3 are the first and third quartiles of the distribution
respectively and Q3 – Q1 is called as inter quartile range.
· It is largely influenced by two extreme values.
(I) Relative measures for QD
· It cannot be computed in the case of open-ended frequency distributions.
Quartile deviation is an absolute measure of dispersion. The relative measure
· It is not suitable for further mathematical treatment. corresponding to this measure, called the coefficient of quartile deviation is
calculated as follows:
7.2. IQR
Coefficient of quartile deviation can be used to compare the degree of
THE INTERQUARTILE RANGE (IQR) variation in different distributions.

The interquartile range is the middle half of the data. To visualize it, think (ii) Computation of Quartile Deviation
about the median value that splits the dataset in half. Similarly, you can divide
the data into quarters. Statisticians refer to these quarters as quartiles and
denote them from low to high as Q1, Q2, Q3, and Q4. The lowest quartile The process of computing quartile deviation is very simple since we just have
(Q1) contains the quarter of the dataset with the smallest values. The upper to compute the values of the upper and lower quartiles that is Q3 and Q1
quartile (Q4) contains the quarter of the dataset with the highest values. The respectively.
interquartile range is the middle half of the data that is in between the upper
and lower quartiles. In other words, the interquartile range includes the 50% Example 8.16
of data points that fall in Q2 and

Calculate the value of quartile deviation and its coefficient from the following
The IQR is the red area in the graph below. data

Solution:

The interquartile range is a robust measure of variability in a similar manner

that the median is a robust measure of central tendency. Neither measure is
influenced dramatically by outliers because they don’t depend on every value.
Additionally, the interquartile range is excellent for skewed distributions, just
like the median. As you’ll learn, when you have a normal distribution, the
7
Marks are arranged in ascending Compute Quartile deviation from the following
order data

Hence coefficient of QD = 0.455

Example 8.17

Compute coefficient of quartile deviation from the following

Solution:

data

Solution:

Example 8.18
7.4. Standard Deviation

Standard Deviation

Consider the following data sets.

8
N = total number of observation (or elements) in the population

c = width of class interval

x = mid-value of each class interval where A is an assumed A.M.

It is obvious that the range for the three sets of data is 8. But a careful look at Variance:
these sets clearly shows the numbers are different and there is a necessity for a
new measure to address the real variations among the numbers in the three Sum of the squares of the deviation from mean is known as Variance.
data sets. This variation is measured by standard deviation. The idea of
standard deviation was given by Karl Pearson in 1893.
The square root of the variance is known as standard deviation.

Definition
Example 6.5

‘Standard deviation is the positive square root of average of the deviations

of all the observation taken from the mean.’ It is denoted by a Greek letter The following data gives the number of books taken in a school library in 7
v. days find the standard deviation of the book taken

a. Ungrouped data 7, 9, 12, 15, 5, 4, 11

x1, x2 , x3 … xn are the ungrouped data then standard deviation is calculated Solution:
by
Actual mean
method

b. Grouped Data (Discrete)

Where, f = frequency of each class interval

N = total number of observation (or elements) in the population

x = mid – value of each class interval

where A is an assumed A.M.

c. Grouped Data (continuous)

Merits:

· The value of standard deviation is based on every observation in a set of

data.

Where, f = frequency of each class interval · It is less affected by fluctuations of sampling.

9
· It is the only measure of variation capable of algebraic treatment.

Limitations:

· Compared to other measures of dispersion, calculations of standard deviation

are difficult.

· While calculating standard deviation, more weight is given to extreme values

and less to those near mean.

· It cannot be calculated in open intervals.

· If two or more data set were given in different units, variation among those
data set cannot be compared.

Example 6.6

Raw Data:

Weights of children admitted in a hospital is given below calculate the

standard deviation of weights of children.

13, 15, 12, 19, 10.5, 11.3, 13, 15, 12, 9

Solution:

Example 6.7

Find the standard deviation of the first ‘n’ natural numbers.

Solution:

The first n natural numbers are 1, 2, 3,, n. The sum and the sum of squares of
these n numbers

10
are

Example 6.8

The wholesale price of a commodity for seven consecutive days in a month is

as
follows:
Example 6.9

The mean and standard deviation from 18 observations is 14 and 12

respectively. If an additional observation 8 is to be included, find the corrected
mean and standard deviation.
Calculate the variance and standard deviation.

Solution:
Solution:

The computations for variance and standard deviation are cumbersome when
x values are large. So, another method is used, which will reduce the
calculation time. Here we take the deviations from an assumed mean or
arbitrary value A such that d = x – A

In this question, if we take deviation from an assumed A.M. =255. The

calculations then for standard deviation will be as shown in below Table.

11
7.5. Lorenz Curve

LORENZ CURVE

The Lorenz curve is a graphical representation of income inequality or wealth

inequality developed by American economist Max Lorenz in 1905. The graph
plots percentiles of the population on the horizontal axis according to income
or wealth. It plots cumulative income or wealth on the vertical axis, so that an
x-value of 45 and a y-value of 14.2 would mean that the bottom 45% of the
population controls 14.2% of the total income or wealth.

1. The Lorenz curve shows the percentage of total income earned by

cumulative percentage of the population.
Example 6.10 2. In a perfectly equal society, the “poorest” 25% of the population
would earn 25% of the total income, the “poorest” 50% of the
A study of 100 engineering companies gives the following information population would earn 50% of the total income and the Lorenz
curve would follow the path of the 45° line of equality.
3. As inequality increases, the Lorenz curve deviates from the line of
equality; the “poorest” 25% of the population may earn 10% of the
total income; the “poorest” 50% of the population may earn 20%
of the total income and so on.

Calculate the standard of the profit earned. The Lorenz curve is often accompanied by a straight diagonal line with a
slope of 1, which represents perfect equality in income or wealth distribution;
the Lorenz curve lies beneath it, showing the actual distribution. The area
Solution: between the straight line and the curved line, expressed as a ratio of the area
under the straight line, is the Gini coefficient, a measurement of inequality.

While the Lorenz curve is most often used to represent economic inequality, it
can also demonstrate unequal distribution in any system. The farther away the
curve is from the baseline, represented by the straight diagonal line, the higher
the level of inequality. In economics, the Lorenz curve denotes inequality in
the distribution of either wealth or income; these are not synonymous since it
is possible to have high earnings but zero or negative net worth, or low
earnings but a large net worth.

12
Then Y is completely determined by X, so that X and Y are perfectly
dependent, but their correlation is zero; they are uncorrelated. However, in the
special case when X and Y are jointly normal, uncorrelatedness is equivalent
Unit - II - Correlation Analysis
to independence.

1. Correlation Coefficient
If we have a series of n measurements of X and Y written as xi and yi where i
= 1, 2, …, n, then the sample correlation coefficient can be used to estimate
CORRELATION: the population Pearson correlation r between X and Y.

In statistics, dependence is any statistical relationship between two random

variables or two sets of data. Correlation refers to any of a broad class of where x and y are the sample means of X and Y, and sx and sy are the sample
statistical relationships involving dependence. Familiar examples of standard deviations of X and Y.
dependent phenomena include the correlation between the physical statures of
parents and their offspring, and the correlation between the demand for a This can also be written as:
product and its price. Correlations are useful because they can indicate a
predictive relationship that can be exploited in practice. For example, an
electrical utility may produce less power on a mild day based on the
correlation between electricity demand and weather. In this example there is a
causal relationship, because extreme weather causes people to use more
electricity for heating or cooling; however, statistical dependence is not
sufficient to demonstrate the presence of such a causal relationship.
If x and y are results of measurements that contain measurement error, the
realistic limits on the correlation coefficient are not −1 to +1 but a smaller
Formally, dependence refers to any situation in which random variables do not
range.
satisfy a mathematical condition of probabilistic independence. In loose
usage, correlation can refer to any departure of two or more random variables
from independence, but technically it refers to any of several more specialized 2. Assumptions of correlation analysis
types of relationship between mean values. There are several correlation
coefficients, often denoted ρ or r, measuring the degree of correlation. The
Assumptions of Correlation Coefficient
most common of these is the Pearson correlation coefficient, which is
sensitive only to a linear relationship between two variables. Other correlation Normality means that the data sets to be correlated should approximate the
coefficients have been developed to be more robust than the Pearson normal distribution. In such normally distributed data, most data points tend to
correlation that is, more sensitive to nonlinear relationships. Mutual hover close to the mean.
information can also be applied to measure dependence between two
variables.
Homoscedascity comes from the Greek prefix hom, along with the Greek
word skedastikos, which means ‘able to disperse’. Homoscedascity means
Pearson's correlation coefficient: ‘equal variances. It means that the size of the error term is the same for all
values of the independent variable. If the error term, or the variance, is smaller
He most familiar measure of dependence between two quantities is the
for a particular range of values of independent variable and larger for another
Pearson product-moment correlation coefficient, or "Pearson's correlation
range of values, then there is a violation of homoscedascity. It is quite easy to
coefficient", commonly called simply "the correlation coefficient". It is
check for homoscedascity visually, by looking at a scatter plot. If the points
obtained by dividing the covariance of the two variables by the product of
lie equally on both sides of the line of best fit, then the data is homoscedastic.
their standard deviations. Karl Pearson developed the coefficient from a
similar but slightly different idea by Francis Galton.
Linearity simply means that the data follows a linear relationship. Again, this
can be examined by looking at a scatter plot. If the data points have a straight
The population correlation coefficient ρX,Y between two random variables X
line (and not a curve) relationship, then the data satisfies the linearity
and Y with expected values μX and μY and standard deviations σX and σY is
assumption.
defined

Continuous variables are those that can take any value within an interval.
as: Ratio variables are also continuous variables. To compute Karl Pearson’s
Coefficient of Correlation, both data sets must contain continuous variables. If
even one of the data sets is ordinal, then Spearman’s Coefficient of Rank
where E is the expected value operator, cov means covariance, and corr a
Correlation would be a more appropriate measure.
widely used alternative notation for the correlation coefficient.

Paired observations mean that every data point must be in pairs. That is, for
The Pearson correlation is defined only if both of the standard deviations are
every observation of the independent variable, there must be a corresponding
finite and nonzero. It is a corollary of the Cauchy–Schwarz inequality that the
observation of the dependent variable. We cannot compute correlation
correlation cannot exceed 1 in absolute value. The correlation coefficient is
coefficient if one data set has 12 observations and the other has 10
symmetric: corr(X, Y) = corr(Y,X).
observations.

The Pearson correlation is +1 in the case of a perfect direct (increasing) linear

No outliers must be present in the data. While statistically there’s no harm if
relationship (correlation), −1 in the case of a perfect decreasing (inverse)
the data contains outliers, they can significantly skew the correlation
linear relationship (autocorrelation), and some value between −1 and 1 in all
coefficient and make it inaccurate. When does a data point become an outlier?
other cases, indicating the degree of linear dependence between the variables.
In general, a data point thats beyond +3.29 or -3.29 standard deviations away,
As it approaches zero there is less of a relationship (closer to uncorrelated).
it is considered to be an outlier. Outliers are easy to spot visually from the
The closer the coefficient is to either −1 or 1, the stronger the correlation
scatter plot.
between the variables.

3. coefficients of determination and correlation

If the variables are independent, Pearson's correlation coefficient is 0, but the
converse is not true because the correlation coefficient detects only linear
dependencies between two variables. For example, suppose the random COEFFICIENT OF DETERMINATION
variable X is symmetrically distributed about zero, and Y = X2.
13
The coefficient of determination (denoted by R2) is a key output of regression When X and Y are linearly related and (X, Y) has a bivariate normal
analysis. It is interpreted as the proportion of the variance in the dependent distribution, the co-efficient of correlation between X and Y is defined as
variable that is predictable from the independent variable.

1.The coefficient of determination is the square of the correlation (r) between

predicted y scores and actual y scores; thus, it ranges from 0 to 1.

2.With linear regression, the coefficient of determination is also equal to the

square of the correlation between x and y scores.
This is also called as product moment correlation co-efficient which was
defined by Karl Pearson.
3.An R2 of 0 means that the dependent variable cannot be predicted from the
independent variable.
Based on a given set of n paired observations (xi, yi), i=1,2, … n the sample
correlation co-efficient between X and Y can be calculated from
4.An R2 of 1 means the dependent variable can be predicted without error
from the independent variable.

5.An R2 between 0 and 1 indicates the extent to which the dependent variable
is predictable. An R2 of 0.10 means that 10 percent of the variance in Y is
predictable from X; an R2 of 0.20 means that 20 percent is predictable; and so
on.

The formula for computing the coefficient of determination for a linear

regression model with one independent variable is given below.
or, equivalently

Coefficient of determination. The coefficient of determination (R2) for a

linear regression model with one independent variable is:

R2 = {( 1 / N ) * Σ [ (xi – x) * (yi – y) ] / (σx * σy ) }2

where N is the number of observations used to fit the model, Σ is the

summation symbol, xi is the x value for observation i, x is the mean x value,
yi is the y value for observation i, y is the mean y value, σx is the standard
deviation of x, and σy is the standard deviation of y.
2. Properties
4. Measurement of Correlation
1. The correlation coefficient between X and Y is same as the
There are two types of for measurement of correlation- correlation coefficient between Y and X (i.e, rxy = ryx).
2. The correlation coefficient is free from the units of measurements
of X and Y
 Karl pearson Method 3. The correlation coefficient is unaffected by change of scale and
 Spearman's Rank origin.
Correlation
Thus, if ui = [xi – A] /c and vi = [yi – B] /d with c ≠ 0 and d ≠ 0 i=1,2, …, n
4.1. Karl Person's Methods

KARL PEARSON’S CORRELATION COEFFICIENT

When there exists some relationship between two measurable variables, we

compute the degree of relationship using the correlation coefficient.

Co-variance

Let (X, Y) be a bivariable normal random variable where V(X) and V(Y)
exists. Then, covariance between X and Y is defined as where A and B are arbitrary values.

cov(X, Y) = E[(X-E(X))(Y-E(Y))] = E(XY) – E(X)E(Y) Remark 1: If the widths between the values of the variabls are not equal then
take c = 1 and d = 1.

If (xi, yi), i=1,2, …, n is a set of n realisations of (X,Y), then the sample

covariance between X and Y can be calculated from Interpretation

The correlation coefficient lies between -1 and +1. i.e. -1 ≤ r ≤ 1

· A positive value of ‘r’ indicates positive correlation.

· A negative value of ‘r’ indicates negative correlation

1. Karl Pearson’s coefficient of correlation
14
· If r = +1, then the correlation is perfect positive

· If r = –1, then the correlation is perfect negative.

· If r = 0, then the variables are uncorrelated.

· If r ≥ 0.7 then the correlation will be of higher degree. In interpretation we

use the adjective ‘highly’

· If X and Y are independent, then rxy = 0. However, the converse need not be
true.

Example 4.1 The following data gives the heights (in inches) of father and his
eldest son. Compute the correlation coefficient between the heights of fathers
and sons using Karl Pearson’s
method.

Solution: Let x denote height of father and y denote height of son. The data is
on the ratio scale.

We use Karl Pearson’s method.

Note: The correlation coefficient computed by using direct method and short-
cut method is the same.

Example 4.2

Calculation The following are the marks scored by 7 students in two tests in a subject.
Calculate coefficient of correlation from the following data and interpret.

Solution:

Let x denote marks in test-1 and y denote marks in test-

Heights of father and son are positively correlated. It means that on the
average, if fathers are tall then sons will probably tall and if fathers are short,
probably sons may be short.
2.

Short-cut method

Let A = 68, B = 69, c = 1 and d = 1

15
However, if we compute the linear correlation r for such data, it may be
zero implying age and health care are uncorrelated, but non-linear
correlation is present.

2. Spurious Correlation: The word ‘spurious’ from Latin means ‘false’ or

‘illegitimate’. Spurious correlation means an association extracted from
correlation coefficient that may not exist in reality.

There is a high positive correlation between test -1 and test-2. That is those
who perform well in test-1 will also perform well in test-2 and those who 4.2. Spearman's Rank Correlation
perform poor in test-1 will perform poor in test- 2.
SPEARMAN’S RANK CORRELATION COEFFICIENT

The students can also verify the results by using shortcut method. If the data are in ordinal scale, then Spearman’s rank correlation coefficient is
used. It is denoted by the Greek letter ρ (rho).

3. Limitations of Correlation
Spearman’s correlation can be calculated for the subjectivity data also, like
competition scores. The data can be ranked from low to high or high to low by
Although correlation is a powerful tool, there are some limitations in using it: assigning ranks.

1. Outliers (extreme observations) strongly influence the correlation Spearman’s rank correlation coefficient is given by the formula
coefficient. If we see outliers in our data, we should be careful
about the conclusions we draw from the value of r. The outliers
may be dropped before the calculation for meaningful conclusion.
2. Correlation does not imply causal relationship. That a change in
one variable causes a change in another.

where Di = R1i – R2i

R1i = rank of i in the first set of data

R2i = rank of i in the second set of data and

n = number of pairs of observations

Interpretation
NOTE

Spearman’s rank correlation coefficient is a statistical measure of the strength

1. Uncorrelated: Uncorrelated (r = 0) implies no ‘linear relationship’. But
of a monotonic (increasing/decreasing) relationship between paired data. Its
there may exist non-linear relationship (curvilinear relationship).
interpretation is similar to that of Pearson’s. That is, the closer to the ±1
means the stronger the monotonic relationship.
Example: Age and health care are related. Children and elderly people
need much more health care than middle aged persons as seen from the
following graph.

16
Solution:

Example 4.3

Two referees in a flower beauty competition rank the 10 types of flowers as

follows:

Use the rank correlation coefficient and find out what degree of agreement is
between the referees.

Solution:

Interpretation: This perfect negative rank correlation (- 1) indicates that

scorings in the subjects, totally disagree. Student who is best in Tamil is
weakest in English subject and vice-versa.

Example 4.5

Quotations of index numbers of equity share prices of a certain joint stock

company, and the prices of preference shares are given
below.

Using the method of rank correlation determine the relationship between

equity shares and preference shares prices.

Interpretation: Degree of agreement between the referee’s ‘A’ and ‘B’ is

0.636 and they have “strong agreement” in evaluating the competitors.

Example 4.4

Calculate the Spearman’s rank correlation coefficient for the following data.

17
Solution: Solution:

Repetitions of ranks

In Commerce (X), 20 is repeated two times corresponding to ranks 3 and 4.

Therefore, 3.5 is assigned for rank 2 and 3 with m1=2.

In Mathematics (Y), 30 is repeated three times corresponding to ranks 3, 4 and

5. Therefore, 4 is assigned for ranks 3,4 and 5 with m2=3.

Therefore,
Interpretation: There is a negative correlation between equity shares and
preference share prices.

There is a strong disagreement between equity shares and preference share

prices.

Repeated ranks

When two or more items have equal values (i.e., a tie) it is difficult to give
ranks to them. In such cases the items are given the average of the ranks they
would have received. For example, if two individuals are placed in the 8th
place, they are given the rank [8+9] / 2 = 8.5 each, which is common rank to
be assigned and the next will be 10; and if three ranked equal at the 8th place,
they are given the rank [8 + 9 +10] /3 = 9 which is the common rank to be Interpretation: Marks in Commerce and Mathematics are uncorrelated
assigned to each; and the next rank will be 11.
5. Regression
In this case, a different formula is used when there is more than one item
having the same value. Regression

MEANING:

1. Regression means going back and it is a mathematical measure

showing the other in the same or the opposite the average
relationship between two variables
2. Both the variables may be random variables
3. It indicates the cause-and-effect relationship between the variables
and establishes functional relationship.
where mi is the number of repetitions of ith rank 4. Besides verification it is used for the prediction of one value, in
relation to the other given value.
5. Regression coefficient is an absolute figure. If we know the value
Example 4.6 of the independent variable, we can find the value of the dependent
variable
Compute the rank correlation coefficient for the following data of the marks 6. In regression there is no such spurious regression
obtained by 8 students in the Commerce and 7. It has wider application, as it studies linear and nonlinear
Mathematics. relationship between the variables
8. It is widely used for further mathematical treatment.

REGRESSION LINE

Definition: The Regression Line is the line that best fits the data, such that the
overall distance from the line to the points (variable values) plotted on a graph
is the smallest. In other words, a line used to minimize the squared deviations
of predictions is called as the regression line.

18
There are as many numbers of regression lines as variables. Suppose we take X = 0.425 ×(Y-10) + 12
two variables, say X and Y, then there will be two regression lines:
X = 0.425Y- 4.25+12
Regression line of Y on X: This gives the most probable values of Y from the
given values of X.
X = 0.425Y+7.75

Regression line of X on Y: This gives the most probable values of X from the
X on Y
given values of Y. The algebraic expression of these regression lines is called
as Regression Equations. There will be two regression equations for the two
regression lines. Answer X = 0.425Y + 7.75

The correlation between the variables depends on the distance between these The regression Y on X is
two regression lines, such as the nearer the regression lines to each other the
higher is the degree of correlation, and the farther the regression lines to each
r = 0.85, σx = 0.1 and σy = 0.2
other the lesser is the degree of correlation.

Then substituting the values in formula

The correlation is said to be either perfect positive or perfect negative when
the two regression lines coincide, i.e. only one line exists. In case, the
variables are independent; then the correlation will be zero, and the lines of (Y-10) = 0.85 × (0.2/0.1) × (X–12)
regression will be at right angles, i.e. parallel to the X axis and Y axis.
(Y-10) = 0.85 × (2) × (X–12)
Note: The regression lines cut each other at the point of average of X and Y.
This means, from the point where the lines intersect each other the
perpendicular is drawn on the X axis we will get the mean value of X. Y = 1.7 × (X–12) + 10
Similarly, if the horizontal line is drawn on the Y axis, we will get the mean
value of Y. Y = 1.7X–20.4+10

Two Regression lines Y = 1.7X–10.4

X on Y => X = a + by Y on X

Y on X => Y = a + bx Answer Y = 1.7X–10.4

Regression line is the line which gives the best estimate of one variable from ASSUMPTIONS IN REGRESSION
the value of any other given variable. The line gives the average relationship
between the two variables in mathematical form. The line of regression is the Independence: The residuals are serially independent (no autocorrelation).
line which gives the best estimate to the value of one variable for any specific The residuals are not correlated with any of the independent (predictor)
value of the other variable. variables.

To fit Regression equations X on Y and Y on X the following examples are Linearity: The relationship between the dependent variable and each of the
given independent variables is linear.

Ex 1: Fit two regression equation X on Y and Y on X for the following Mean of Residuals: The mean of the residuals is zero.
data.Xbar= 12, Ybar=10, σy= 0.2, σx =0.1 and r = 0.85
Homogeneity of Variance: The variance of the residuals at all levels of the
Solution independent variables is constant.

The regression X on Y is Errors in Variables: The independent (predictor) variables are measured
without error.

Model Specification: All relevant variables are included in the model. No

irrelevant variables are included in the model.

Normality: The residuals are normally distributed. This assumption is needed

for valid tests of significance but not for estimation of the regression
coefficients.
r = 0.85, σx = 0.1 and σy = 0.2
5.1. Meaning
Then substituting the values in formula
Regression is a statistical method used in finance, investing, and other
disciplines that attempts to determine the strength and character of the
Then substituting the values in formula relationship between one dependent variable (usually denoted by Y) and a
series of other variables (known as independent variables)
(X-12) = 0.85 × (0.1/0.2) × (Y-10)
5.2. Assumptions
(X-12) = 0.85 × (0.5) × (Y-10)
19
ASSUMPTIONS IN REGRESSION extended to the multivariate model in which there are two or more
independent variables.
1. Independence: The residuals are serially independent (no
autocorrelation).
ordinary least squares (OLS) are a type of linear least squares method for
2. The residuals are not correlated with any of the independent estimating the unknown parameters in a linear regression model. OLS chooses
(predictor) variables. the parameters of a linear function of a set of explanatory variables by the
3. Linearity: The relationship between the dependent variable and principle of least squares: minimizing the sum of the squares of the
each of the independent variables is linear. differences between the observed dependent variable (values of the variable
4. Mean of Residuals: The mean of the residuals is zero. being predicted) in the given dataset and those predicted by the linear
function.
5. Homogeneity of Variance: The variance of the residuals at all
levels of the independent variables is constant.
6. Errors in Variables: The independent (predictor) variables are Geometrically, this is seen as the sum of the squared distances, parallel to the
measured without error. axis of the dependent variable, between each data point in the set and the
corresponding point on the regression surface – the smaller the differences,
7. Model Specification: All relevant variables are included in the
the better the model fits the data. The resulting estimator can be expressed by
model. No irrelevant variables are included in the model.
a simple formula, especially in the case of a simple linear regression, in which
8. Normality: The residuals are normally distributed. This there is a single regressor on the right side of the regression equation.
assumption is needed for valid tests of significance but not for
estimation of the regression coefficients.
The OLS estimator is consistent when the regressors are exogenous, and
optimal in the class of linear unbiased estimators when the errors are
5.3. Regression Lines homoscedastic and serially uncorrelated [citation needed]. Under these
conditions, the method of OLS provides minimum-variance mean-unbiased
REGRESSION LINE estimation when the errors have finite variances. Under the additional
assumption that the errors are normally distributed, OLS is the maximum
Definition: The Regression Line is the line that best fits the data, such that the
likelihood estimator.
overall distance from the line to the points (variable values) plotted on a graph
is the smallest. In other words, a line used to minimize the squared deviations
of predictions is called as the regression line. OLS is used in fields as diverse as economics (econometrics), data science,
political science, psychology and engineering (control theory and signal
processing).
There are as many numbers of regression lines as variables. Suppose we take
two variables, say X and Y, then there will be two regression lines:
5.5. Pitfalls and limitations associated with regression and correlation analysis
Regression line of Y on X: This gives the most probable values of Y from the Pitfalls Associated with Regression and Correlation Analysis
given values of X.
The regression analysis as a statistical tool has a number of uses, or utilities
Regression line of X on Y: This gives the most probable values of X from the for which it is widely used in various fields relating to almost all the natural,
given values of Y. physical and social sciences. the specific uses, or utilities of such a technique
may be outlined as under:

The algebraic expression of these regression lines is called as Regression

Equations. There will be two regression equations for the two regression lines. 1. It provides a functional relationship between two or more related
variables with the help of which we can easily estimate or predict
the unknown values of one variable from the known values of
The correlation between the variables depends on the distance between these another variable.
two regression lines, such as the nearer the regression lines to each other the
higher is the degree of correlation, and the farther the regression lines to each 2. It provides a measure of errors of estimates made through the
other the lesser is the degree of correlation. regression line. A little scatter of the observed (actual) values
around the relevant regression line indicates good estimates of the
values of a variable, and less degree of errors involved therein. On
The correlation is said to be either perfect positive or perfect negative when the other hand, a great deal of scatter of the observed values
the two regression lines coincide, i.e. only one line exists. In case, the around the relevant regression line indicates inaccurate estimates
variables are independent; then the correlation will be zero, and the lines of of the values of a variable and high degree of errors involved
regression will be at right angles, i.e. parallel to the X axis and Y axis. therein.
3. It provides a measure of coefficient of correlation between the two
Note: The regression lines cut each other at the point of average of X and Y. variables which can be calculated by taking the square root of the
This means, from the point where the lines intersect each other the product of the two regression coefficients e. r = √(b×y. byx)
perpendicular is drawn on the X axis we will get the mean value of X. 4. It provides a measure of coefficient of the determination which
Similarly, if the horizontal line is drawn on the Y axis, we will get the mean speaks of the effect of the independent variable (explanatory, or
value of Y. regressing variable) on the dependent variable (explained or
regressed variable) which in its turn give us an idea about the
5.4. Ordinary least square method of regression predictive values of the regression analysis. This coefficient of
determination is computed by taking the product of the two
Ordinary least square method of regression: regression coefficients e. r2 = bxy. Byx The greater the value of
the Coefficient of Determination (r2), the better is the fit, and more
Ordinary least squares (OLS) regression is a statistical method of analysis that useful are the regression equations as the estimating devices.
estimates the relationship between one or more independent variables and a 5. It provides a formidable tool of statistical analysis in the field of
dependent variable; the method estimates the relationship by minimizing the business and commerce where people are interested in predicting
sum of the squares in the difference between the observed and predicted the future events viz.: consumption, production, investment, prices,
values of the dependent variable configured as a straight line. In this entry, sales, profits, etc. and success of businessmen depends very much
OLS regression will be discussed in the context of a bivariate model, that is, a on the degree of accuracy in their various estimates.
model in which there is only one independent variable (X ) predicting a 6. It provides a valuable tool for measuring and estimating the cause-
dependent variable ( Y ). However, the logic of OLS regression is easily
and-effect relationship among the economic variables that

20
constitute the essence of economic theory and economic life. It is Don’t Be Afraid to Reverse: Sometimes, a decision made needs to be
highly used in the estimation of Demand curves, Supply curves, revisited. A business move that was ideal in January might not be as effective
Production functions, Cost functions, Consumption functions etc. in July. Evaluating previous business decisions and choosing to modify or
In fact, economists have propounded many types of production completely reverse that decision is acceptable. Although long-term, permanent
function by fitting regression lines to the input and output data. decisions are typically the goal, sometimes, you have to think short-term and
7. This technique is highly used in our day-to-day life and then revisit the issue at a later date.
sociological studies as well to estimate the various factors viz.
birth rate, death rate, tax rate, yield rate, etc. Gray Areas: Not every decision is black and white. In some cases, you have
8. Last but not the least, the regression analysis technique gives us an to look at the gray area, as well. This is where having a strong management
idea about the relative variation of a series. team or additional expertise comes into play. These people can help a business
owner see all sides of an issue, which improves the chances of making a keen
business decision that’s in the company’s best interest.
Limitations Associated with Regression and Correlation Analysis

Despite the above utilities and usefulness, the technique of regression analysis Assumptions Usage in Business Decision Making
suffers from the following serious limitations:
Assumptions are ideas that we presume to be true before taking decisions.
Assumptions are also made in businesses for developing a strategy, planning
1. It is assumed that the cause-and-effect relationship between the and making decisions. These conjectures are generally standardized as
variables remains unchanged. This assumption may not always disclosure of uncertainty and risk.
hold good and hence estimation of the values of a variable made on
the basis of the regression equation may lead to erroneous and
misleading results. Business, in most cases, occurs in an unsure setting and assumptions are
necessary to move ahead with stratagem. Documenting assumptions help in
2. The functional relationship that is established between any two or
recognizing threats.
more variables on the basis of some limited data may not hold
good if more and more data are taken into consideration. For
example, in case of the Law of Return, the law of diminishing Often, you come up with fresh ideas after brainstorming probable assumptions
return may come to play, if too much of inputs are used with ca that will help you in improvising your business strategies.
view to increase the volume of output.
3. It involves very lengthy and complicated procedure of calculations Here are some of the common types of business assumptions:
and analysis.
4. It cannot be used in case of qualitative phenomenon viz. honesty,
Financial
crime etc.

Even after making profits, it often takes months or even years to pay off the
Unit - III - Linear Programming
initial investments. Nilesh Biswas, Founder of Calcutta Skyline, a realty
consulting firm and My Classroom, an education startup, believes enterprises
1. Concepts and assumptions usage in business decision making linear fail because owners feel they can support the business operations on sales.
programming problem
“Imagining sales volumes to be more than adequate for making a profit in
CONCEPT next few years thereby helping you meet your debt service obligations, is a
chimera. If you have sufficient capital to run the business until break-even,
In business, executives need to make a plethora of decisions. From which reveal that information in the plan. Otherwise, give the investment figures or
company will provide and stock the break room’s vending machines to which loan amount you’ll require to start off the business,” he advised.
costs get cut during a lean time, all require careful thought and planning. It’s
essential that every decision, no matter what its perceived importance may be,
be made with the best intentions and the company’s best interests at heart. Customer Base

The Health of the Business Depends on It: Making snap business decisions The next major assumption is the belief that consumers will be keen to buy
can make or break a business. Changing business practices on a whim or your products or services, generating sufficient sales to make profit for the
because an owner is in a bad mood can have irreversible consequences. All long run. Biswas wants your business plans to exhibit more figures as
business decisions should be carefully considered and should preferably have potential customers than required, as all will not buy from you, and many will
input from several others. The business owner has the final say, but other buy from rivals.
voices should be brought to the table so that all sides of an issue can be
considered. “Definite formulae do not exist to calculate this number; the surplus potential
buyer base should be substantially more than the sales need. If 100 people are
It Takes a Team: A business is only as good as its weakest member, so needed to buy from you each day, plan on the requirement of an exponential
putting together a strong management group is paramount to the business’ number, about five to 10 times the number wanted,” he informed.
ability to make good decisions, which leads to the success of the business. All
members of management need to be on the same page, as to what constitutes Resources
the mission and objectives of the business. Once everybody knows the
mission, deciding how to meet its objectives becomes easier. If you value a
person enough to put her on the team, then you should value that person’s The assumption that key talent will be available is a dangerous one.
opinion, even if it disagrees with yours. According to Anuj Dhawan, Founder of Ridenest, an app targeted to ensuring
women safety on public places, quality talent pool becomes a challenge. “The
VC-funded fat paychecks have made getting talent for bootstrapped
Consult Someone with More Expertise: If you don’t feel secure enough in companies quite a struggle. We have spent significant time in curating the
your abilities or in your team’s abilities to make a solid decision, there is people who would like to work with us,” he shared.
nothing wrong with getting help from someone outside of the company. For
example, a graphics design firm might consult with an attorney to look over a
new contract, or an attorney might hire an accountant to assist with the Profitability
financial statements for the law firm. No one is an expert at everything, and
good decision making includes knowing when to seek additional assistance.

21
Profitability is the ambition of every entrepreneur. However, the assumption
must be validated by market research, financial planning and sales projections
as sales is not the only factor determining profitability.

“After calculating the development and overhead costs, reassess the price to
pay off your start-up costs and then start thinking of profit. Either decide on a
pricing strategy to create high sales volumes by selling at a low price or
capitalize on profit margins with a higher price,” he explained.

Management Expertise

Products are not created automatically, and companies do not run themselves.
Proceeding on a plan that the founders alone can run a business profitably
leads invariably to disappointment. According to R. K. Agrawal, an
independent business consultant who was the Managing Partner in S. R. Step4: Put the values of corners in the qbjective function
Batliboi & Co., the creators can make the brightest of products or gadgets that
exist in the market but that doesn’t mean they are armed with organizing,
accounting, marketing, finance, legal, tax and other skills required to run a
business.

“A business plan should lay bare that the founders while excelling in
product making or service delivery, have planned for resources to manage
other verticals of their business,” added Agarwal, who has over 40
exposures in industries, including Steel, Paper, Cement, Automobiles,
Textile, Milk & Dairy Products, etc. both in India and abroad.

2. Linear programming problem: Formulation

Example Step5: Select the required value from the table of values and use the calculated
values of variables as a solution.
(Manufacturing problem) A manufacturing company makes two models A
and B of a product. Each piece of Model A requires 9 labour hours for 2. Linear programming problem: Formulation
fabricating and 1 labour hour for finishing. Each piece of Model B requires 12
labour hours for fabricating and 3 labour hours for finishing. For fabricating
and finishing, the maximum labour hours available are 180 and 30 Mathematical formulation of a linear programming problem:
respectively. The company makes a profit of Rs 8000 on each piece of model
The procedure for mathematical formulation of a linear programming problem
A and Rs 12000 on each piece of Model B. How many pieces of Model A and
consists of the following steps.
Model B should be manufactured per week to realise a maximum profit? What
is the maximum profit per week?
(i) Identify the decision variables.
Solution
(ii) Identify the objective function to be maximized or minimized and express
it as a linear function of decision variables.
Step1: Formulation of objective function which is to be maximised or
minimised
(iii) Identify the set of constraint conditions and express them as linear
inequalities or equations in terms of the decision variables.
Total profit (in Rs) = 8000 x + 12000 y, Let Z = 8000 x + 12000 y

Example 10.1
Step2: Framming of constraints

A furniture dealer deals only two items viz., tables and chairs. He has to invest
9x + 12y ≤ 180 (Fabricating constraint) i.e. 3x + 4y ≤ 60 … (2)
Rs.10,000/- and a space to store atmost 60 pieces. A table cost him Rs.500/–
and a chair Rs.200/–. He can sell all the items that he buys. He is getting a
x + 3y ≤ 30 (Finishing constraint) … (3) profit of Rs.50 per table and Rs.15 per chair. Formulate this problem as an
LPP, so as to maximize the profit.
x ≥ 0, y ≥ 0 (non-negative constraint) … (4)
Solution:
Step3: Plot the constraint equations on the graph and shade the common
region which satisfy the individual equations (i) Variables: Let x1 and x2 denote the number of tables and chairs
respectively.

(i) Objective function:

Profit on x1 tables = 50 x1

Profit on x2 chairs = 15 x2

22
Total profit = 50 x1+15x2 (ii) Objective function: Profit on x1 units of the product P1 = 20 x1

Let Z = 50 x1+15 x2, which is the objective function. Profit on x2 units of the product P2 = 25 x2

Since the total profit is to be maximized, we have to maximize Z=50x1+15x2 Profit on x3 units of the product P3 = 15 x3

(iii) Constraints: Total profit = 20 x1 +25 x2 +15 x3

The dealer has a space to store atmost 60 pieces Since the total profit is to be maximized, we have to maximize Z= 20 x1+25
x2 +15 x3
x1+x2 ≤ 60
Constraints:
The cost of x1 tables = Rs.500 x1
6x1 +3 x2 +12 x3 ≤ 200
The cost of x2 tables = Rs. 200 x2
2 x1 +5 x2 +4 x3 ≤ 350
Total cost = 500 x1 + 200 x2, which cannot be more than 10000
x1 +2 x2 + x3 ≤ 100
500 x1 + 200 x2 ≤ 10000
Non-negative restrictions: Since the number of units of the products A, B
and C cannot be negative, we have x1, x2, x3 ≥ 0
5x1+ 2x2 ≤ 100

Thus, we have the following linear programming model.

(iv) Non-negative restrictions: Since the number of tables and chairs cannot
be negative, we have x1 ≥0, x2 ≥0
Maximize Z = 20 x1 +25 x2 +15 x3
Thus, the mathematical formulation of the LPP is
Subject to
Maximize Z=50 x1+15x2
6 x1 +3 x2 +12 x3≤ 200
Subject to the constrains
2x1+5x2 +4 x3 ≤ 350
x1+x2 ≤60
x1+2 x2 + x3≤ 100
5x1+2x2≤100
x1, x2, x3 ≥ 0
x1, x2≥0
Example 10.3
Example 10.2
A dietician wishes to mix two types of food F1 and F2 in such a way that the
vitamin contents of the mixture contain atleast 6units of vitamin A and 9 units
A company is producing three products P1, P2 and P3, with profit
of vitamin B. Food F1 costs Rs.50 per kg and F2 costs Rs 70 per kg. Food F1
contribution of Rs.20, Rs.25 and Rs.15 per unit respectively. The resource
contains 4 units per kg of vitamin A and 6 units per kg of vitamin B while
requirements per unit of each of the products and total availability are given
food F2 contains 5 units per kg of vitamin A and 3 units per kg of vitamin B.
below.
Formulate the above problem as a linear programming problem to minimize
the cost of mixture.

Solution:

(i) Variables: Let the mixture contains x1 kg of food F1 and x2 kg of food F2

(ii) Objective function:

cost of x1 kg of food F1 = 50 x1

Formulate the above as a linear programming model. cost of x2 kg of food F2 = 70x2

Solution: The cost is to be minimized

(i) Variables: Let x1, x2 and x3 be the number of units of products P1, P2 and Therefore, minimize Z= 50 x1+70x2
P3 to be produced.
(iii) Constraints:

23
We make the following table from the given Objective function: Minimize Z = 600 x1 + 400 x2
data
(ii) Constraints: 3000 x1 + 1000 x2 ≥ 24000(since there is a demand of
24000 bottles of drink A, production should not be less than 24000)

1000x1 + 1000 x2 ≥ 16000

2000x1+ 6000 x2 ≥ 48000

(iii) Non-negative restrictions: Since be the number of days required of a

firm are non-negative, we have x1, x2 ≥0
(iii)
Thus, we have the following LP model.
4x1+5x2 ≥6 (since the mixture contains ‘atleast 6’ units of vitamin A, we have
the inequality of the type ≥ ) Minimize Z = 600 x1 + 400 x2

6x1+3x2 ≥9(since the mixture contains ‘atleast 9’ units of vitamin B, we have subject to 3000 x1 + 1000 x2 ≥ 24000
the inequality of the type ≥ )

1000 x1 + 1000 x2 ≥ 16000

(iv) Non-negative restrictions:

2000 x1 + 6000 x2 ≥ 48000

Since the number of kgs of vitamin A and vitamin B are non-negative, we
have x1, x2 ≥0
and x1, x2 ≥0
Thus, we have the following linear programming model
3. Methods of solving: Graphical & Simplex
Minimize Z = 50 x1 +70x2
Maximisation Problem
subject to 4 x1+5x2 ≥6
Example
6 x1+3x2 ≥9
Maximize Z = f(x, y) = 3x + 2y
and x1, x2≥0
subject to: 2x + y ≤ 18
Example 10.4
2x + 3y ≤ 42
A soft drink company has two bottling plants C1 and C2. Each plant produces 3x + y ≤ 24
three different soft drinks S1, S2 and S3. The production of the two plants in
number of bottles per day are: x ≥ 0, y ≥ 0
Solution

Step1: Make a change of variables and normalize the sign of the independent
terms.

x becomes X1, y becomes X2

Step2: The inequalities become equations by adding slack, surplus and

artificial variables as the following

A market survey indicates that during the month of April there will be a
demand for 24000 bottles of S1, 16000 bottles of S2 and 48000 bottles of S3.
The operating costs, per day, of running plants C1 and C2 are respectively
Rs.600 and Rs.400. How many days should the firm run each plant in April so
that the production cost is minimized while still meeting the market demand?
Formulate the above as a linear programming model.

Solution: table:

(i) Variables: Let x1 be the number of days required to run plant C1 and x2 In this case, a slack variable (X3, X4 and X5) is introduced in each of the
be the number of days required to run plant C2 restrictions of ≤ type, to convert them into equalities, resulting the system of
linear equations:

24
2·X1 + X2 + X3 = 18 Once obtained the input base variable, the output base variable is determined.
The decision is based on a simple calculation: divide each independent term
(P0 column) between the corresponding value in the pivot column, if both
2·X1 + 3·X2 + X4 = 42
values are strictly positive (greater than zero). The row whose result is
minimum score is chosen.
3·X1 + X2 + X5 = 24
If there is any value less than or equal to zero, this quotient will not be
Step3: Reframe the objective Z - 3·X1 - 2·X2 - 0·X3 - 0·X4 - 0·X5 = 0 performed. If all values of the pivot column satisfy this condition, the stop
condition will be reached, and the problem has an unbounded solution (see
Simplex method theory).
Step4: Writing the initial tableau of Simplex method.

In this example: 18/2 [=9], 42/2 [=21] and 24/3 [=8]

The initial tableau of Simplex method consists of all the coefficients of the
decision variables of the original problem and the slack, surplus and artificial
variables added in second step (in columns, with P0 as the constant term and The term of the pivot column which led to the lesser positive quotient in the
Pi as the coefficients of the rest of Xi variables), and constraints (in rows). previous division indicates the row of the slack variable leaving the base. In
The Cb column contains the coefficients of the variables that are in the base. this example, it is X5 (P5), with 3 as coefficient. This row is called pivot row
(in green).
The first row consists of the objective function coefficients, while the last row
contains the objective function value and reduced costs Zj - Cj. If two or more quotients meet the choosing condition (case of tie), other than
that basic variable is chosen (wherever possible).
The last row is calculated as follows: Zj = Σ(Cbi·Pj) for i = 1.m, where if j =
0, P0 = bi and C0 = 0, else Pj = aij. Although this is the first tableau of the The intersection of pivot column and pivot row marks the pivot value, in this
Simplex method and all Cb are null, so the calculation can simplify, and by example
this time Zj = -Cj.
Step6: Update tableau

The new coefficients of the tableau are calculated as follows:

In the pivot row each new value is calculated as: New value = Previous value /
Pivot In the other rows each new value is calculated as: New value = Previous
value - (Previous value in pivot column * New value in pivot row) So the
pivot is normalized (its value becomes 1), while the other values of the pivot
column are canceled (analogous to the Gauss-Jordan method).

Calculations for P4 row are shown below:

Step4: Stopping condition.

If the objective is to maximize, when in the last row (indicator row) there is no
negative value between discounted costs (P1 columns below) the stop
condition is reached.

In that case, the algorithm reaches the end as there is no improvement

possibility. The Z value (P0 column) is the optimal solution of the problem.

Another possible scenario is all values are negative or zero in the input
variable column of the base. This indicates that the problem is not limited, and
the solution will always be improved.

Otherwise, the following steps are executed iteratively.

Step7: When checking the stop condition is observed which is not fulfilled
Step5: Choice of the input and output base variables since there is one negative value in the last row, -1. So, continue the following
steps
First, input base variable is determined. For this, column whose value in Z
row is the lesser of all the negatives is chosen. In this example it would be the 1. The input base variable is X2 (P2), since it is the variable that
variable X1 (P1) with -3 as coefficient. corresponds to the column where the coefficient is -1.
2. To calculate the output base variable, the constant terms P0
If there are two or more equal coefficients satisfying the above condition (case column) are divided by the terms of the new pivot column: 2 / 1/3
of tie), then choice the basic variable. [=6], 26 / 7/3 [=78/7] and 8 / 1/3 [=24]. As the lesser positive
quotient is 6, the output base variable is X3 (P3).
3. The new pivot is 1/3.
The column of the input base variable is called pivot column (in green color).
Updating the values of tableau again is obtained:

25
4. If the optimal solution occurs at more than one extreme point, then
the value of the objective function will be the same for all convex
combinations of these extreme points.

METHOD-1: Extreme Point Enumeration Approach

This solution method for an LP problem is divided into five steps.

Step 1

State the given problem in the mathematical form.

Step8: Checking again the stop condition reveals that the pivot row has one Step 2
negative value, -1. It means that optimal solution is not reached yet, and we
must continue the following steps:
Graph the constraints, by temporarily ignoring the inequality sign and decide
about the area of feasible solutions according to the inequality sign of the
1. The input base variable is X5 (P5), since it is the variable that constraints. Indicate the area of feasible solutions by a shaded area, which
corresponds to the column where the coefficient is -1. forms a convex polyhedron.
2. To calculate the output base variable, the constant terms (P0) are
divided by the terms of the new pivot column: 6/ (-2) [=-3] , 12/4
[=3] , and 6/1 [=6]. In this iteration, the output base variable is X4 Step 3
(P4).
3. The new pivot is 4.
Determine the coordinates of the extreme points of the feasible solution space.

Updating the values of tableau again is obtained:

Step 4

Evaluate the value of the objective function at each extreme point.

Step 5

Determine the extreme point to obtain the optimum (best) value of the
objective function.

Example 1

Use the graphical method to solve the following LPP problem:

Step9: End of algorithm. Maximize z = 15X1 + 10X2

It is noted that in the last row, all the coefficients are positive, so the stop Subject to constraints
condition is fulfilled.
4X1+6X2 <=360
The optimal solution is given by the val-ue of Z in the constant term’s column
(P0 column), in the example: 33. In the same column, the point where it 3X1+0X2<=180
reaches is shown, watching the corresponding rows of input decision
variables: X1 = 3 and X2 = 12.
0X1+5X2 <=200
Undoing the name change gives x = 3 and y = 12.
X1, X2>=0
3. Methods of solving: Graphical & Simplex
Solution
GRAPHICAL SOLUTION PROCEDURE OF LP PROBLEMS
Step 1
While obtaining the optimal solution to the LP problem by the graphical
method, the statement of the following theorems of linear programming is
State the problem in mathematical form.
used.

The problem we have considered here is already in mathematical form [given

1. The collection of all feasible solutions to an LP problem
in terms of mathematical symbols and notations]
constitutes a convex set whose extreme points correspond to the
basic feasible solutions.
2. There are a finite number of basic feasible solutions within the Step 2
feasible solution space.
3. If the convex set of the feasible solutions of the system Ax=b,
Plot the constraints on the graph paper and find the feasible region.
x>=0, is a convex polyhedron, then at least one of the extreme
points gives an optimal solution.
26
We shall treat x1 as the horizontal axis and x2 as the vertical axis. Each Step 3
constraint will be plotted on the graph by treating it as a linear equation and
then appropriate inequality conditions will be used to mark the area of the
Determine the coordinates of extreme points
feasible solutions.

Since the optimal value of the objective function occurs at one of the extreme
Consider the first constraint 4X1+6X2<=360. Treated it as the equation,
points of the feasible region, it is necessary to determine their coordinates.
The coordinates of extreme points of the feasible region are:
4X1+6X2 =360.
O = (0, 0),

A= (60, 0)

B= (60, 20),
This equation indicates that when it is plotted, the graph cuts X1 axis
[intercept] at 90 and X2 axis [x2 intercept] at 60. These two points are then
connected by a straight line as shown in figure. C= (30, 40),

But the inequality and non-negativity condition can only be satisfied by the D= (0, 40)
shaded area (feasible region) as shown in figure below:
Step 4

Evaluate the value of the objective at extreme points Now, the next step is to
evaluate the objective function value at each extreme point of the feasible
region, which is shown below:

Similarly, the constraints 3X1<=180 and 5X2<=200 are also plotted on the
same graph, which is shown above and the required region is indicated by the
shaded area, which is common to all the 3 constraint as shown in figure
below:

Step 5

Determining the optimal value of the objective function Since our desire it to
maximize the value of the objective function, Z, therefore from step 4 we can
conclude that maximum value of Z=1100 is achieved at the point B (60, 20).
Hence the optimal solution to the given LP problem is:

X1=60,

X2=20 and

Value of the objective function, Z=1100

Example 2: Use the graphical method to solve the following LP problem

Minimize Z=20X1+10X2

Subject to constraints,
Since all constraints have been graphed, the area which is bounded by all the
constraints lines including all the boundary points is called the feasible region
or solution space. The feasible region is shown in fig by the area OABCD. X1+2X2 ≤ 40

27
3X1+X2 ≥ 30

4X1+3X2 ≥ 60

And X1, X2 ≥ 0

Solution

Step 1

State the problem in mathematical form.

The problem we have considered here is already in mathematical form [given

in terms of mathematical symbols and notations]

Step 2

Plot the constraints on the graph paper and find the feasible region.
Step 3

We shall treat x1 as the horizontal axis and x2 as the vertical axis. Each
Determine the coordinates of extreme points
constraint will be plotted on the graph by treating it as a linear equation and
then appropriate inequality conditions will be used to mark the area of the
feasible solutions. Since the optimal value of the objective function occurs at one of the extreme
points of the feasible region, it is necessary to determine their coordinates.
The coordinates of extreme points of the feasible region are:
The constraints are written in the following form.

A= (15, 0), B= (40, 0), C= (4, 18) and D= (6, 12).

X1/40 + X2/ (40/2) ≤ 1 or

Step 4
X1/40 + X2/20 ≤ 1-----

Evaluate the value of the objective at extreme points Now, the next step is to
(1)
evaluate the objective function value at each extreme point of the feasible
region, which is shown below:
X1/ (30/3) + X2/3 ≥ 1 or

X1/10 + X2/30 ≤ 1----

(2)

X1/ (60/4) + X2/(60/3) ≥ 1 or X1/15 + X2/20 ≥ 1--------------

(3)

Graph each constraint by first treating it as a linear equation. Then use the
inequality condition of each constraint to make the feasible region as shown in
following figure.

Therefore, the minimum value of the objective function Z=240 occurs at the
extreme point D (6, 12). Hence, the optimal solution to the given LP problem
is:

X1 = 6, X2 = 12

And the value of the objective function is

Z = 240.

The Simplex Method

We have already encountered linear programming (see Section 6.6)—the

general problem of optimizing a linear function of several variables subject to
a set of linear constraints:
28
We mentioned there that many important practical problems can be modeled
as instances of linear programming. Two researchers, L. V. Kantorovich of
the former Soviet Union and the Dutch American T. C. Koopmans, were even
awarded the Nobel Prize in 1975 for their contributions to linear programming
theory and its applications to economics. Apparently because there is no
Nobel Prize in mathematics, the Royal Swedish Academy of Sciences failed
to honor the U.S. mathematician G. B. Dantzig, who is universally recognized
as the father of linear programming in its modern form and the inventor of the
simplex method, the classic algorithm for solving such problems.

Geometric Interpretation of Linear Programming

Before we introduce a general method for solving linear programming

with the feasible region—see Figure 10.2—the answer to the posed question is
problems, let us consider a small example, which will help us to see the
no. On the other hand, there are infinitely many feasible points for which the
fundamental prop-erties of such problems.
objective function is equal to, say, 10: they are the intersection points of the
line 3x + 5y = 10 with the feasible region. Note that the lines 3x + 5y = 20 and
EXAMPLE 1 Consider the following linear programming problem in two 3x + 5y = 10 have the same slope, as would any line defined by equation 3x +
variables: 5y = z where z is some constant. Such lines are called level lines of the
objective function. Thus, our problem can be restated as finding the largest
value of the parameter z for which the level line 3x + 5y = z has a common
point with the feasible region.

We can find this line either by shifting, say, the line 3x + 5y = 20 south-west
(without changing its slope!) toward the feasible region until it hits the region
for the first time or by shifting, say, the line 3x + 5y = 10 north-east until it
hits the feasible region for the last time. Either way, it will happen at the point
(3, 1) with the corresponding z value 3. 3 + 5. 1 = 14. This means that the
By definition, a feasible solution to this problem is any point (x, y) that optimal solution to the linear programming problem in question is x = 3, y =
satisfies all the constraints of the problem; the problem’s feasible region is the 1, with the maximal value of the objective function equal to 14.
set of all its feasible points. It is instructive to sketch the feasible region in the
Cartesian plane. Recall that any equation ax + by = c, where coefficients a and
b are not both equal to zero, defines a straight line. Such a line divides the Note that if we had to maximize z = 3x + 3y as the objective function in
plane into two half-planes: for all the points in one of them, ax + by < c, while problem (10.2), the level line 3x + 3y = z for the largest value of z would
for all the points in the other, ax + by > c. (It is easy to determine which of the coincide with the boundary line segment that has the same slope as the level
two half-planes is which: take any point (x0, y0) not on the line ax + by = c lines (draw this line in Figure 10.2). Consequently, all the points of the line
and check which of the two inequalities hold, ax0 + by0 > c or ax0 + by0 < c.) segment between vertices (3, 1) and (4, 0), including the vertices themselves,
In particular, the set of points defined by inequality x + y ≤ 4 comprises the would be optimal solutions, yielding, of course, the same maximal value of
points on and below the line x + y = 4, and the set of points defined by the objective function.
inequality x + 3y ≤ 6 comprises the points on and below the line x + 3y = 6.
Since the points of the feasible region must satisfy all the constraints of the
problem, the feasible region is obtained by the intersection of these two half-
planes and the first quadrant of the Cartesian plane defined by the
nonnegativity constraints x ≥ 0, y ≥ 0 (see Figure 10.1). Thus, the feasible
region for problem (10.2) is the convex polygon with the vertices (0, 0), (4, 0),
(0, 2), and (3, 1). (The last point, which is the point of intersection of the lines
x + y = 4 and x + 3y = 6, is obtained by solving the system of these two linear
equations.) Our task is to find an optimal solution, a point in the feasible
region with the largest value of the objective function z = 3x + 5y.

Are there feasible solutions for which the value of the objective function
equals, say, 20? The points (x, y) for which the objective function z = 3x + 5y
is equal to 20 form the line 3x + 5y = 20. Since this line does not have
common points

Does every linear programming problem have an optimal solution that can be
found at a vertex of its feasible region? Without appropriate qualifications, the
answer to this question is no. To begin with, the feasible region of a linear
programming problem can be empty. For example, if the constraints include
two contradictory requirements, such as x + y ≤ 1 and x + y ≥ 2, there can be
no points in the problem’s feasible region. Linear programming problems with

29
the empty feasible region are called infeasible. Obviously, infeasible problems The idea of this algorithm can be described in geometric terms as follows.
do not have optimal solutions. Start by identifying an extreme point of the feasible region. Then check
whether one can get an improved value of the objective function by going to
an adjacent extreme point. If it is not the case, the current point is optimal—
Another complication may arise if the problem’s feasible region is
stop; if it is the case, proceed to an adjacent extreme point with an improved
unbounded, as the following example demonstrates.
value of the objective function. After a finite number of steps, the algorithm
will either reach an extreme point where an optimal solution occurs or
EXAMPLE 2 If we reverse the inequalities in problem (10.2) to x + y ≥ 4 and determine that no optimal solution exists.
x + 3y ≥ 6, the feasible region of the new problem will become unbounded
(see Figure 10.3). If the feasible region of a linear programming problem is
An Outline of the Simplex Method
unbounded, its objective function may or may not attain a finite optimal value
on it. For example, the problem of maximizing z = 3x + 5y subject to the
constraints x + y ≥ 4, x + 3y ≥ 6, x ≥ 0, y ≥ 0 has no optimal solution, because Our task now is to “translate” the geometric description of the simplex method
there are points in the feasible region making 3x + 5y as large as we wish. into the more algorithmically precise language of algebra. To begin with,
Such problems are called unbounded. On the other hand, the problem of before we can apply the simplex method to a linear programming problem, it
minimizing z = 3x + 5y subject to the same constraints has an optimal solution has to be represented in a special form called the standard form. The standard
(which?). form has the following requirements:

It must be a maximization problem.

All the constraints (except the nonnegativity constraints) must be in the form
of linear equations with nonnegative right-hand sides.

All the variables must be required to be nonnegative.

Thus, the general linear programming problem in standard form with m

constraints and n unknowns (n ≥ m) is

Fortunately, the most important features of the examples we considered above

hold for problems with more than two variables. In particular, a feasible
region of a typical linear programming problem is in many ways similar to
convex polygons in the two-dimensional Cartesian plane. Specifically, it
always has a finite number of vertices, which mathematicians prefer to call
extreme points (see Section 3.3). Furthermore, an optimal solution to a linear Any linear programming problem can be transformed into an equivalent
programming problem can be found at one of the extreme points of its feasible problem in standard form. If an objective function needs to be minimized, it
region. We reiterate these properties in the following theorem. can be replaced by the equivalent problem of maximizing the same objective
function with all its coefficients cj replaced by −cj, j = 1, 2, . . . , n (see
Section 6.6 for a more general discussion of such transformations). If a
THEOREM (Extreme Point Theorem) Any linear programming problem constraint is given as an inequality, it can be replaced by an equivalent
with a nonempty bounded feasible region has an optimal solution; moreover, equation by adding a slack variable representing the difference between the
an op-timal solution can always be found at an extreme point of the problem’s two sides of the original inequality. For example, the two inequalities of
feasible region.2 problem (10.2) can be transformed, respectively, into the following equations:

This theorem implies that to solve a linear programming problem, at least in x + y + u = 4 where u ≥ 0 and x + 3y + v = 6 where v ≥ 0.
the case of a bounded feasible region, we can ignore all but a finite number of
points in its feasible region. In principle, we can solve such a problem by
computing the value of the objective function at each extreme point and Finally, in most linear programming problems, the variables are required to be
selecting the one with the best value. There are two major obstacles to nonnegative to begin with because they represent some physical quantities. If
implementing this plan, however. The first lies in the need for a mechanism this is not the case in an initial statement of a problem, an unconstrained
for generating the extreme points of the feasible region. As we are going to variable
see below, a rather straightforward algebraic procedure for this task has been
discovered. The second obstacle lies in the number of extreme points a typical
xj can be replaced by the difference between two new nonnegative variables:
feasible region has. Here, the news is bad: the number of extreme points is
xj = x'j − x''j, x'j ≥ 0, x''j ≥ 0.
known to grow exponentially with the size of the problem. This makes the
exhaustive inspection of extreme points unrealistic for most linear
programming problems of nontrivial sizes. Thus, problem (10.2) in standard form is the following linear programming
problem in four variables:
Fortunately, it turns out that there exists an algorithm that typically inspects
only a small fraction of the extreme points of the feasible region before
reaching an optimal one. This famous algorithm is called the simplex method.

30
rows and n + 1 columns. Each of the first m rows of the table contains the
coefficients of a corresponding constraint equation, with the last column’s
entry containing the equation’s right-hand side. The columns, except the last
one, are labeled by the names of the variables. The rows are labeled by the
basic variables of the basic feasible solution the tableau represents; the values
of the basic variables of this solution are in the last column. Also note that the
columns labeled by the basic variables form the m × m identity matrix.

It is easy to see that if we find an optimal solution (x∗, y∗, u∗, v∗) to
problem (10.4), we can obtain an optimal solution to problem (10.2) by The last row of a simplex tableau is called the objective row. It is initialized
simply ignoring its last two coordinates. by the coefficients of the objective function with their signs reversed (in the
first n columns) and the value of the objective function at the initial point (in
the last column). On subsequent iterations, the objective row is transformed
The principal advantage of the standard form lies in the simple mechanism it the same way as all the other rows. The objective row is used by the simplex
provides for identifying extreme points of the feasible region. To do this for method to check whether the current tableau represents an optimal solution: it
problem (10.4), for example, we need to set two of the four variables in the does if all the entries in the objective row—except, possibly, the one in the
con-straint equations to zero to get a system of two linear equations in two last column—are nonnegative. If this is not the case, any of the negative
unknowns and solve this system. For the general case of a problem with m entries indicates a nonbasic variable that can become basic in the next tableau.
equations in n unknowns (n ≥ m), n − m variables need to be set to zero to get
a system of m equations in m unknowns. If the system obtained has a unique
solution—as any nondegenerate system of linear equations with the number of For example, according to this criterion, the basic feasible solution (0, 0, 4, 6)
equations equal to the number of unknowns does—we have a basic solution; represented by tableau (10.5) is not optimal. The negative value in the x-
its coordinates set to zero before solving the system are called nonbasic, and column signals the fact that we can increase the value of the objective
its coordinates obtained by solving the system are called basic. (This function z = 3x + 5y + 0u + 0v by increasing the value of the x-coordinate in
terminology comes from linear algebra. the current basic feasible solution (0, 0, 4, 6). Indeed, since the coefficient for
x in the objective function is positive, the larger the x value, the larger the
value of this function. Of course, we will need to “compensate” an increase in
Specifically, we can rewrite the system of constraint equations of (10.4) as x by adjusting the values of the basic variables u and v so that the new point is
still feasible. For this to be the case, both conditions

x + u = 4 where u ≥ 0 x + v = 6 where v ≥ 0

must be satisfied, which means that

A basis in the two-dimensional vector space is composed of any two vectors x ≤ min {4, 6} = 4.
that are not proportional to each other; once a basis is chosen, any vector can
be uniquely expressed as a sum of multiples of the basis vectors. Basic and Note that if we increase the value of x from 0 to 4, the largest amount
nonba-sic variables indicate which of the given vectors are, respectively, possible, we will find ourselves at the point (4, 0, 0, 2), an adjacent to (0, 0, 4,
included and excluded in a particular basis choice.) 6) extreme point of the feasible region, with z = 12.

If all the coordinates of a basic solution are nonnegative, the basic solution is Similarly, the negative value in the y-column of the objective row signals the
called a basic feasible solution. For example, if we set to zero variables x and fact that we can also increase the value of the objective function by increasing
y and solve the resulting system for u and v, we obtain the basic feasible the value of the y-coordinate in the initial basic feasible solution (0, 0, 4, 6).
solution (0, 0, 4, 6); if we set to zero variables x and u and solve the resulting This requires
system for y and v, we obtain the basic solution (0, 4, 0, −6), which is not
feasible. The impor-tance of basic feasible solutions lies in the one-to-one
correspondence between them and the extreme points of the feasible region. y + u = 4 where u ≥ 0 3y + v = 6 where v ≥ 0,
For example, (0, 0, 4, 6) is an extreme point of the feasible region of problem
(10.4) (with the point (0, 0) in Fig-ure 10.1 being its projection on the x, y which means that
plane). Incidentally, (0, 0, 4, 6) is a natural starting point for the simplex
method’s application to this problem.

As mentioned above, the simplex method progresses through a series of

adjacent extreme points (basic feasible solutions) with increasing values of the
objective function. Each such point can be represented by a simplex tableau, a
table storing the information about the basic feasible solution corresponding
to the extreme point. For example, the simplex tableau for (0, 0, 4, 6) of
problem (10.4) is presented below: If we increase the value of y from 0 to 2, the largest amount possible, we will
find ourselves at the point (0, 2, 2, 0), another adjacent to (0, 0, 4, 6) extreme
point, with z = 10.

If there are several negative entries in the objective row, a commonly used
rule is to select the most negative one, i.e., the negative number with the
largest absolute value. This rule is motivated by the observation that such a
choice yields the largest increase in the objective function’s value per unit of
change in a vari-able’s value. (In our example, an increase in the x-value from
0 to 1 at (0, 0, 4, 6) changes the value of z = 3x + 5y + 0u + 0v from 0 to 3,
while an increase in the y-value from 0 to 1 at (0, 0, 4, 6) changes z from 0 to
5.) Note, however, that the feasibility constraints impose different limits on
how much each of the variables may increase. In our example, in particular,
the choice of the y-variable over the x-variable leads to a smaller increase in
In general, a simplex tableau for a linear programming problem in standard the value of the objective function. Still, we will employ this commonly used
form with n unknowns and m linear equality constraints (n ≥ m) has m + 1
31
rule and select variable y as we continue with our example. A new basic
variable is called the entering variable, while its column is referred to as the
pivot column; we mark the pivot column by ↑.

Now we will explain how to choose a departing variable, i.e., a basic variable
to become nonbasic in the next tableau. (The total number of basic variables
in any basic solution must be equal to m, the number of the equality
constraints.) As we saw above, to get to an adjacent extreme point with a
larger value of the objective function, we need to increase the entering
variable by the largest amount possible to make one of the old basic variables
zero while preserving the nonnegativity of all the others. We can translate this
observation into the following rule for choosing a departing variable in a
simplex tableau: for each positive entry in the pivot column, compute the θ -
ratio by dividing the row’s last entry by the entry in the pivot column. For the
example of tableau (10.5), these θ -ratios are

The row with the smallest θ -ratio determines the departing variable, i.e., the
variable to become nonbasic. Ties may be broken arbitrarily. For our example,
it is variable v. We mark the row of the departing variable, called the pivot
row, by ←

and denote it row. Note that if there are no positive entries in the pivot
column, no θ -ratio can be computed, which indicates that the problem is This tableau represents the basic feasible solution (3, 1, 0, 0). It is optimal
unbounded, and the algorithm stops. because all the entries in the objective row of tableau (10.7) are nonnegative.
The maximal value of the objective function is equal to 14, the last entry in
Finally, the following steps need to be taken to transform a current tableau the objective row.
into the next one. (This transformation, called pivoting, is similar to the
princi-pal step of the Gauss-Jordan elimination algorithm for solving systems Let us summarize the steps of the simplex method.
of linear equations—see Problem 8 in Exercises 6.2.) First, divide all the
entries of the pivot
Summary of the simplex method

Step 0 Initialization Present a given linear programming problem in stan-dard

form and set up an initial tableau with nonnegative entries in the rightmost
column and m other columns composing the m × m identity matrix. (Entries
in the objective row are to be disregarded in verifying these requirements.)
These m columns define the basic variables of the initial basic feasible
solution, used as the labels of the tableau’s rows.

Step 1 Optimality test If all the entries in the objective row (except, possibly,
the one in the rightmost column, which represents the value of the objective
function) are nonnegative—stop: the tableau represents an optimal solution
whose basic variables’ values are in the rightmost column and the remaining,
nonbasic variables’ values are zeros.

Step 2 Finding the entering variable Select a negative entry from among the
first n elements of the objective row. (A commonly used rule is to select the
negative entry with the largest absolute value, with ties broken arbitrarily.)
Mark its column to indicate the entering variable and the pivot column.

Step 3 Finding the departing variable for each positive entry in the pivot
column, calculate the θ -ratio by dividing that row’s entry in the right-most
Tableau (10.6) represents the basic feasible solution (0, 2, 2, 0) with an column by its entry in the pivot column. (If all the entries in the pivot column
increased value of the objective function, which is equal to 10. It is not are negative or zero, the problem is unbounded—stop.) Find the row with the
optimal, however (why?). smallest θ -ratio (ties may be broken arbitrarily) and mark this row to indicate
the departing variable and the pivot row.
The next iteration—do it yourself as a good exercise! —yields tableau (10.7):
Step 4 Forming the next tableau Divide all the entries in the pivot row by its
entry in the pivot column. Subtract from each of the other rows, including the
objective row, the new pivot row multiplied by the entry in the pivot column
of the row in question. (This will make all the entries in the pivot column 0’s
except for 1 in the pivot row.) Replace the label of the pivot row by the
variable’s name of the pivot column and go back to Step 1.

32
. Make use of dual variable identified in Step 2; write down the constraints for
the dual problem.
3.null. Graphical and Simplex
a. If the primal is a maximization problem, the dual constraints must be all of
‘>=’ type and vice versa.
4. Problems with mixed constraints: Duality, Concept, significance

b. The column coefficients of the primal constraints become the row

Duality – Introduction
coefficients of the dual constraints.
The term dual in the general sense means implies two or double. In the
context of linear programming duality implies that each programming c. The coefficients of the primal objective function become the right-hand side
problem can be analyzed in two different ways but having equivalent constants of the dual constraints.
solutions. Associated with every linear programming problem whether it is
maximization or minimization type there exist another linear programming
problem, which is based upon same data, and having same solution. The d. The dual variables are defined to be unrestricted in sign.
original problem is called the primal problem while the associated one is
called its dual problem. Step 5

The concept of duality is based on the fact that any linear programming must Using steps 3 and 4, write down the dual of the L. P. P
be first put in its standard form before solving the problem by simplex
method. Since all the primal-dual computations are obtained directly from the
simplex table, it is logical that we define the dual that may be constituent with Advantages of duality
the standard form of the primal.
1. It is advantageous to solve the dual of a primal having fewer
The format of the Simplex method is such that solving one type of problem is constraints, because the number of constraints usually equals the
equivalent to solving the other simultaneously. number of iterations required to solve the problem.
2. It also avoids the necessary for adding surplus or artificial
variables and solves the problem quickly (i.e. primal-dual
Thus, if the optimal solution to one is known, then optimal solution of the algorithm). In economics, duality is useful in the formulation of
other can also be read from the final simplex table. the input and output systems.
3. The dual variables provide an important economic interpretation of
In some cases, considerable computing time can be saved by solving the dual. the final solution of an LP problem.
4. It is quite useful when investigating changes in the coefficients or
formulation of an LP problem (i.e. in sensitivity analysis).
The importance of the duality concept is due to two main reasons. 5. Duality is used to solve an LP problem by the Simplex method in
which the initial solution is infeasible (i.e. the dual simplex
1. Firstly, if the primal contains a large number of constraints and a method).
smaller number of variables, converting it to the dual problem and
then solving it can considerably reduce the labor of computation. Example 1
2. Secondly, the interpretation of the dual variables from the cost or
economic point of view proves extremely useful in making future
decisions in the activities being programmed. Write the dual of the following linear programming problem:

Steps In Formulating the Dual Problem Maximize Z=x1-x2+3x3

Various steps involved in formulation of a primal-dual pair are: Subject to the constraints:

Step 1 x1+x2+x3<=10

Put the given linear programming problem into its standard form. Consider it 2x1-0x2-x3<=2
as a primal problem.
2x1-2x2-3x3<=6
Step 2
Where x1, x2, x3>=0
Identify the variables to be used in the dual problem. For each constraint,
assign one dual variable. The number of the variables in the dual problem will Solution
be equal to the number of constraint equations in the primal problem.

If W1, W2, W3 be the dual variable corresponding to the primal constraints.

Step 3

Then, the dual problem will be

1. Write down the objective function of the dual, using the right-hand
side constants of the primal constraints.
2. If the problem is of maximization type, the dual will be a Minimize Z=10W1+2W2+6W3
minimization problem and vice versa.
Subject to the constraints:
Step 4
W1+2W2+2W3>=1

33
W1-2W3>=-1 optimum values of the objective functions of the two problems are
equal is called Fundamental theorem of duality.
4. Each basic feasible solution in primal problem is related to a
W1-W2-3W3>=3
complementary basic feasible solution in dual problem.
5. Complementary slackness theorem:
Where W1, W2, W3 >=0
a) If a primal variable is positive, then the corresponding dual constraint is an
Example 2 equation at the optimum and vice versa.

Max Z=3x1+10x2+2x3 b) If a primal constraint is a strict inequality, then the corresponding dual
variable is zero at the optimum and vice versa.
Subject to
6. If dual has no feasible solution, then primal also admits no feasible
solution.
2x1+3x2+2x3<=7

Duality and Simplex Method

3x1-2x2+4x3=3
If primal is a maximization problem, then following are the set of rules that
And x1, x2, x3 >=0 govern the derivation of the optimum solution:

Solution Rule 1

Given the primal LPP is Corresponding net evaluations of the starting primal variables=Difference
between the left and right sides of the dual constraints associated with the
starting primal variables.
Max Z=3x1+10 x2+2x3
Rule 2
Subject to
Negative of the corresponding net evaluations of the starting dual variables
2x1+3x2+2x3<=7 equal to the difference between the left and right sides of the primal
constraints associated with the starting dual variables.
3x1-2x2+4x3=3
Rule 3
And x1, x2, x3 >=0
If the primal (dual) problem is unbounded, then the dual (primal) problem
Since the primal problem contains 2 constraints and 3 variables, the dual does not have any feasible solution.
problem will contain 3 constraints and 2 dual variables Y1, Y2.
Example 1
Also, since the second constraint in the primal problem is with equality, the
corresponding second dual variable Y2 is unrestricted in sign. Use duality to solve the following L.P.P:

Therefore, the dual problem is Maximize Z=2X1+X2

Min W=7Y1+3Y2 Subject to constraints:

Subject to X1+2X2<=10

2Y1+3Y2>=3 X1+X2<=6

3Y1-2Y2>=10 X1-X2<=2

2Y1+4Y2>=2 X1-2X2<=1

And Y1>=0, & Y2 is unrestricted. X1, X2, X3>=0

Important theorems’ statement used in Primal -Dual context Solution

1. The dual of the dual is primal. The dual problem of the given primal is
2. If either the primal or the dual problem has an unbound objective
function value, then the other problem has no feasible solution is
known as Existence theorem. Minimize Z*=10Y1+6Y2+2Y3+Y4
3. If the primal or the dual has a finite optimum solution, then the
other problem also possess a finite optimum solution and the Subject to constraints:

34
Y1+Y2+Y3+Y4>=2

2Y1+Y2-Y3-2Y4>=1

Y1, Y2, Y3, Y4>=0

Introducing surplus variables s1>=0, s2>=0 and artificial variables A1>=0,

A2>=0, an initial basic feasible solution is A1=2 and A2=1.

The iterative simplex tables are:

Initial Iteration

Introduce y1 and drop

y8.

Thus, an optimum feasible solution to the dual problem is

w1=0, w2=3/2, w3=1/2.

Min Z*= - (- 10) =10 Tags: Operations Management - Introduction to

Operations Research Last 30 days 414 views

Unit - IV - Transportation and Assignment Problems

1. General structure of transportation problem

Transportation Problem

The objective of transportation problem is to determine the amount to be

transported from each origin to each destination such that the total
transportation cost is minimized.

Definition and formulation

35
The Structure of the Problem 2. Different types of methods for findind initial solution by North - West
Corner Rule
Let there be m origins and n destinations. Let the amount of supply at the i th
origin is ai. Let the demand at j th destination is bj. Methods for finding initial solution by North - West Corner Rule:

There are several methods available to obtain an initial basic feasible solution
The cost of transporting one unit of an item from origin i to destination j is cij of a transportation problem. We discuss here only the following three. For
and is known for all combinations (i, j). Quantity transported from origin i to finding the initial basic feasible solution total supply must be equal to total
destination j be xij demand.

The objective is to determine the quantity xij to be transported over all routes
(i, j) so as to minimize the total transportation cost. The supply limits at the
origins and the demand requirements at the destinations must be satisfied.

The above transportation problem can be written in the following tabular

form:

Method: North-West Corner Rule (NWC)

It is a simple method to obtain an initial basic feasible solution. Various steps

involved in this method are summarized below.

Step 1: Choose the cell in the northwest corner of the transportation
Table10.1 and allocate as much as possible in this cell so that either the
capacity of first row (supply)is exhausted or the destination requirement of the
first column(demand) is exhausted. (i.e) x11 = min(a1, b1)

Step 2: If the demand is exhausted (b1 < a1), move one cell right horizontally
tothe second column and allocate as much as possible. (i.e) x12= min (a1 –
x11, b2)

If the supply is exhausted (b1 > a1), move one cell down vertically to the
second row and allocateas much as possible. (i.e)x21 = min (a2,b1 – x11)
Now the linear programming model representing the transportation problem is
given by If both supply and demand are exhausted move one cell diagonally and
allocate as much as possible.

Step 3: Continue the above procedure until all the allocations are made

Example 10.1

Obtain the initial solution for the following problem

Some Definitions

Feasible Solution: A feasible solution to a transportation problem is a set of

non-negative values xij (i=1,2,..,m, j=1,2,…n) that satisfies the constraints.

Basic Feasible Solution: A feasible solution is called a basic feasible solution

if it contains not more than m+n–1 allocation, where m is the number of rows
and n is the number of columns in a transportation problem.

Optimal Solution: Optimal Solution is a feasible solution (not necessarily

basic) which optimizes(minimize) the total transportation cost.

Nondegenerate basic feasible Solution: If a basic feasible solution to a Solution:

transportation problem contains exactly m+n–1 allocations in independent
positions, it is called a Non degenerate basic feasible solution. Here m is the
number of rows and n is the number of columns in a transportation problem. Here total supply = 5+8+7+14=34, Total demand = 7+9+18=34

Degeneracy: If a basic feasible solution to a transportation problem contains (i.e) Total supply =Total demand∴The given problem is balanced
less than m+n–1 allocations , it is called a degenerate basic feasible solution. transportation problem.
Here m is the number of rows and n is the number of columns in a
∴ we can findan initial basic feasible solution to the given problem.
transportation problem.

36
From the above table we can choose the cell in the Northwest Corner. Here
the cell is (1, A)

Allocate as much as possible in this cell so that either the capacity of first row
is exhausted or the destination requirement of the first column is exhausted.

i.e. x11 = min (5,7) = 5

Reduced transportation table is

Here north west corner cell is (2, B) Allocate as much as possible in the first
cell so that either the capacity of second row is exhausted or the destination
requirement of the second column is exhausted.

i.e. x22 = min (6,9) = 6

Now the cell in the Northwest corner is (2, A)

Allocate as much as possible in the first cell so that either the capacity of
second row is exhausted or the destination requirement of the first column is
exhausted.

i.e. x12 = min (2,8) = 2

Reduced transportation table is

37
Here north west corner cell is (3, B).
Reduced transportation table and final allocation is x44 = 14

Allocate as much as possible in the first cell so that either the capacity of third
row is exhausted or the destination requirement of the second column is
exhausted.

i.e. x32 = min (7,3) = 3

Thus, we have the following allocations

Reduced transportation table is

Transportation schedule: 1→ A, 2→A, 2→B, 3→B, 3→C, 4→C

Here north west corner cell is (3, C) Allocate as much as possible in the first The total transportation cost.
cell so that either the capacity of third row is exhausted or the destination
requirement of the third column is exhausted.
= (5 × 2) + (2 × 3) +(6 × 3) +(3 × 4)+(4 × 7) + (14 × 2)
i.e. x33 = min (4,18) = 4
= Rs.102

Example 10.2

Determine an initial basic feasible solution to the following transportation

problem using Northwest corner rule.

38
Here Oi and Dj represent ith origin and jth destination.

Solution:

Given transportation table is

Third Allocation:

Total Availability = Total Requirement

∴ The given problem is balanced transportation problem.

Hence there exists a feasible solution to the given problem. Fourth Allocation:

First allocation:

Fifth allocation:
Second allocation:

39
Step:3: Eliminate the row or column where an allocation is made.

Step:4: Repeat the above steps for the reduced transportation table until all
the allocations are made.

Example 10.3

Obtain an initial basic feasible solution to the following transportation

problem using least cost method.

Final allocation:

Here Oi and Dj denote ith origin and jth destination respectively.

Solution:

Total Supply = Total Demand = 24

∴ The given problem is a balanced transportation problem.

Hence there exists a feasible solution to the given problem.

Given Transportation Problem is:

Transportation schedule: O1→ D1, O1→D2, O2→D2, O2→D3, O3→D3,
O3→D3.

The transportation cost

= (6 × 6) +(8 × 4)+(2 × 9)+(14 × 2)+(1 × 6)+(4 × 2) =Rs.128

3. Least cost method and testing for optimality

Methods of finding initial Basic Feasible Solutions

There are several methods available to obtain an initial basic feasible solution
of a transportation problem. We discuss here only the following three. For
finding the initial basic feasible solution total supply must be equal to total
demand. The least cost is 1 corresponds to the cells (O1, D1) and (O3, D4)

Take the Cell (O1, D1) arbitrarily.

Allocatemin (6,4) = 4 units to this cell.

Method: Least Cost Method (LCM)

The least cost method is more economical than north-west corner rule, since it
starts with a lower beginning cost. Various steps involved in this method are
summarized as under.

Step 1: Find the cell with the least(minimum) cost in the transportation table.

Step 2: Allocate the maximum feasible quantity to this cell.

40
The least costis 2 corresponds to the cells (O1, D2), (O2, D3), (O3, D2), (O3,
D3)

The reduced table is Allocate min (2,6) = 2 units to this cell.

The least cost corresponds to the cell (O3, D4). Allocate min (10,6) = 6 units The reduced table is
to this cell.

The least cost is 2 corresponds to the cells (O2, D3), (O3, D2), (O3, D3)

Allocate min (8,8) = 8 units to this cell.

The reduced table is

41
The reduced table is

Cost is expressed in terms of rupees per unit shipped.

Here allocate 4 units in the cell (O3, D2)
Solution:

Total Capacity = Total Demand

∴ The given problem is balanced transportation problem.

Hence there exists a feasible solution to the given problem.

Given Transportation Problem is

Thus, we have the following allocations:

First Allocation:

Transportation schedule:

O1→ D1, O1→D2, O2→D3, O3→D2, O3→D4

Total transportation cost

= (4×1) + (2×2)+(8×2)+(4×2)+(6×1)

= 4+4+16+8+6

=Rs. 38.

Example 10.4 Second Allocation:

Determine how much quantity should be stepped from factory to various

destinations for the following transportation problem using the least cost
method

42
Third Allocation: Sixth Allocation:

Transportation schedule:
Fourth Allocation:
T→ H, T→P, B→C, B→H, M→H, M→K

The total Transportation cost = (5×8) + (25×5)+ (35×5) + (5×11)+ (18×9) +

(32×7)

= 40+125+175+55+162+224

= ₹ 781

Testing For Optimality

Once the feasible solution is obtained, the next step is to check whether it is
optimum or not. There are two methods used for testing the optimality:

Stepping-stone Method

Modified Distribution Method (MODI)

Fifth Allocation:
STEPPINGSTONE METHOD

The Steppingstone Method is used to check the optimality of the initial

feasible solution determined by using any of the method Viz. North-West
Corner, Least Cost Method or Vogel’s Approximation Method. Thus, the
steppingstone method is a procedure for finding the potential of any non-basic
variables (empty cells) in terms of the objective function.

Through steppingstone method, we determine that what effect on the

transportation cost would be in case one unit is assigned to the empty cell.

43
With the help of this method, we come to know whether the solution is
optimal or not.

The series of steps are involved in checking the optimality of the initial
feasible solution using the steppingstone method:

1. The prerequisite condition to solve for the optimality is to ensure that the
number of occupied cells is exactly equal to m+n-1, where ‘m’ is the number
of rows, while ‘n’ is equal to the number of columns. 2. Firstly, the empty cell
is selected and then the closed path is created which starts from the
unoccupied cell and returns to the same unoccupied cell, called as a “closed
loop”. For creating a closed loop, the following conditions should be kept in
mind:

1. In a closed loop, cells are selected in a sequence such that one cell
is unused/unoccupied, and all other cells are used/occupied.
2. A pair of Consecutive used cells lies either in the same row or the
same column.
3. No three consecutive occupied cells can either be in the same row
or column.
4. The first and last cells in the closed loop lies either in the same row
or column.
5. Only horizontal and vertical movement is allowed.

3.Once the loop is created, assign “+” or “–“sign alternatively on each corner
cell of the loop, but begin with the “+” sign for the unoccupied cell.

4.Repeat these steps again until all the unoccupied cells get evaluated.

5.Now, if all the computed changes are positive or are equal to or greater than
zero, then the optimal solution has been reached.

6.But in case, if any, value comes to be negative, then there is a scope to

reduce the transportation cost further. Then, select that unoccupied cell which
has the most negative change and assign as many units as possible. Subtract
the unit that added to the unoccupied cell from the other cells with a negative
sign in a loop, to balance the demand and supply requirements. With the new matrix so formed, again the empty cells will be evaluated
through a loop formation and signs will be assigned accordingly. The cell with
the highest opportunity cost will be assigned the units, and this process will
Example, suppose the following matrix shows the initial feasible solution and repeat until the best optimum solution is obtained or the opportunity cost of all
steppingstone method is adopted to check its optimality: the unoccupied cells comes to be negative.

MODIFIED DISTRIBUTION METHOD

Definition: The Modified Distribution Method or MODI is an efficient

method of checking the optimality of the initial feasible solution.

The concept of MODI can be further comprehended through an illustration

given below:

1. Initial basic feasible solution is given below:

44
2. Choose the largest positive opportunity cost, which is 7 and draw a
closed path, as shown in the matrix below. Start from the
unoccupied cell and assign “+” or “– “sign alternatively.
Therefore, the most favored cell is BD, assign as many units as
possible.

2. Now, calculate the values of ui and vj by using the equation:

ui+vj = Cij 3. The matrix below shows the maximum allocation to the cell BD,
and that number of units are added to the cell with a positive sign
Substituting the value of u1 as 0 and subtracted from the cell with a negative sign.

U1+V1 = C11, 0+V1 = 6 or V1 = 6

U1 +V2 = C12, 0+V2 = 4 or V2 = 4

U2+V2 = C22, U2+4 = 8 or U2 = 4

U3+ V2 = C32, U3+4 = 4 or U3 = 0

U3+V3 = C33, 0+V3 = 2 or V3 =2

4. Again, repeat the steps from 1 to 4 i.e. find out the opportunity
costs for each unoccupied cell and assign the maximum possible
units to the cell having the largest opportunity cost. This process
will go on until the optimum solution is reached.

The Modified distribution method is an improvement over the steppingstone

method since; it can be applied more efficiently when a large number of
sources and destinations are involved, which becomes quite difficult or
tedious in case of stepping stone method.

1. Next step is to calculate the opportunity cost of the unoccupied Modified distribution method reduces the number of steps involved in the
cells (AF, BD, BF, CD) by using the following formula: evaluation of empty cells, thereby minimizes the complexity and gives a
straightforward computational scheme through which the opportunity cost of
each empty cell can be determined.
Cij – (ui+Vi)

4. Assignment problem
AF = C13 – (U1+V3), 1- (0+2) = -1 or 1

Assignment problems (Hungarian Method)

BD = C21 – (U2+v1), 3- (4+6) = -7 or 7
First check whether the number of rows is equal to the numbers of columns, if
BF = C23 – (U2+V3), 7- (4+2) = 1 or -1 it is so, the assignment problem is said to be balanced.

CD = C31- (U3+V1), 4- (0+6) = -2 or 2 Step :1 Choose the least element in each row and subtract it from all the
elements of that row.

45
Step :2 Choose the least element in each column and subtract it from all the
elements of that column. Step 2 has to be performed from the table obtained in
step 1.

Step:3 Check whether there is atleast one zero in each row and each column
and make an assignment as follows.

(i) Examine the rows successively until a row with exactly one zero is found.
Mark that zero by, that means an assignment is made there . Cross (×) all
other zeros in its column. Continue this until all the rows have been examined.
Since each row and column contains atleast one zero, assignments can be
made.
(ii) Examine the columns successively until a column with exactly one zero is
found. Mark that zero by, that means an assignment is made there . Cross (× )
all other zeros in its row. Continue this until all the columns have been Step 3 (Assignment):
examined
Examine the rows with exactly one zero. First three rows contain more than
Step :4 If each row and each column contain exactly one assignment, then the one zero. Go to row D. There is exactly one zero. Mark that zero by (i.e) job
solution is optimal. D is assigned to machine I. Mark other zeros in its column by ×.

Example 10.7

Solve the following assignment problem. Cell values represent cost of

assigning job A, B, C and D to the machines I, II, III and IV.

Step 4: Now examine the columns with exactly one zero. Already there is an
assignment in column I. Go to the column II. There is exactly one zero. Mark
that zero by. Mark other zeros in its rowby ×.

Solution:

Here the number of rows and columns are equal.

∴ The given assignment problem is balanced. Now let us find the solution.

Step 1: Select the smallest element in each row and subtract this from all the
elements in its row. Column III contains more than one zero. Therefore, proceed to Column IV,
there is exactly one zero. Mark that zero by. Mark other zeros in its row by ×.

Look for atleast one zero in each row and each column. Otherwise go to step
2.
Step 5: Again, examine the rows. Row B contains exactly one zero. Mark that
Step 2: Select the smallest element in each column and subtract this from all zero by.
the elements in its column.

46
Thus, all the four assignments have been made. The optimal assignment
schedule and total cost is

Column 3 contains no zero. Go to Step 2.

Step 2: Select the smallest element in each column and subtract this from all
the elements in its column.

The optimal assignment (minimum) cost

= ₹ 38

Example 10.8

Consider the problem of assigning five jobs to five persons. The assignment
costs are given as follows. Determine the optimum assignment schedule.

Since each row and column contains atleast one zero, assignments can be
made.

Step 3 (Assignment):

Examine the rows with exactly one zero. Row B contains exactly one zero.
Solution: Mark that zero by (i.e) PersonB is assigned to Job 1. Mark other zeros in its
column by ×.
Here the number of rows and columns are equal.

∴ The given assignment problem is balanced.

Now let us find the solution.

Step 1: Select the smallest element in each row and subtract this from all the
elements in its row.

The cost matrix of the given assignment problem is

47
Now, Row C contains exactly one zero. Mark that zero by. Mark other zeros
in its column by ×.

There is an assignment in Column 3 and column 4. Go to Column 5. There is

exactly one zero. Mark that zero by. Mark other zeros in its row by ×.

Now, Row D contains exactly one zero. Mark that zero by. Mark other zeros
in its column by ×.

Thus, all the five assignments have been made. The Optimal assignment
schedule and total cost is

Row E contains more than one zero, now proceed column wise. In column 1,
there is an assignment. Go to column 2. There is exactly one zero. Mark that
zero by. Mark other zeros in its row by ×.

The optimal assignment (minimum) cost = ` 9

Example 10.9

Solve the following assignment problem.

48
Solution:

Since the number of columns is less than the number of rows, given
assignment problem is unbalanced one. To balance it, introduce a dummy
column with all the entries zero. The revised assignment problem is Since each row and each columncontains exactly one assignment, all the three
men have been assigned a task. But task S is not assigned to any Man. The
optimal assignment schedule and total cost is

Here only 3 tasks can be assigned to 3 men.

Step 1: is not necessary, since each row contains zero entry. Go to Step 2. The optimal assignment (minimum) cost = ₹ 35

Step 2: 4.1. Hungarian assignment method

Steps of Hungarian assignment method

1. Checking wheather the problem is balanced or not.

2. In each row we select minimum element and subtract same from
all elements of row and do then from column.
3. If number of assignments is equal to the number of order of cost
material, optimum solution is obtained, but if not, we go to the
next step.
4. Draw minimum number of horizontal and vertical lines to cover all
0s of cost matrix using-

a. Mark [tick] row with no assignment of 0s.

b. Mark column that has 0.

c. Mark row that has assigned 0 in marked column repeat step 2 and 3 intil
chain of marking is complete.
Step 3 (Assignment):
d. Draw line through all unmarked rows and columns, this gives us the desired
minimum number of lines.

e. Find the smallest element of reduced cost matrix not covered by any of
lines and subtract this element from all uncovered by any of lines and add
same to all element lying at intersection of any 2 lines.

4.2. Unbalanced assignment problem

Unbalanced assignment problem

49
1. An Assignment problem is known as balanced one, if the number
of rows equals to the number of columns.
2. When the number of rows is not equal to number of columns, the
assignment models are called as unbalanced assignment problems.
3. In case, if the given problem is unbalanced one, you have to add
appropriate number of rows or columns with zero as assignment
cost [dummy row or dummy column] and make the problem as
balanced assignment model.

Exercise-1

A university wants to allocate the four subjects, and six teachers claim that
they have the required competencies!! /Knowledge!! to teach all the subjects.
The dean believes that the failure in the course is reflection of faculty
member’s performance! Allocate the subject to appropriate faculty members.

1. The first straight line is drawn at Column-5 and second is at

Column-6, which are having six zeros each.
2. The next line is drawn at Row-3
3. Fourth line is drawn at Row-1; fifth at Row-4; the last line at Row-
5
4. Since the number of straight lines equal to number of
rows/columns, optimum number of zeros is available to make the
final allocation.

Solution

You may notice that the number of rows (6) not equal to the number of
columns (4); thus, the given problem is an example of unbalanced assignment
models. We add two dummy columns (2 dummy subjects) with zero as
number failures. The modified problem becomes a balanced one and the initial
table is given below.

The next step is to consider the zeros and making the allocations. Since every
row is having zeros; we do the column wise search;

1. Column-1, Column-3 & Column-4 are having unique zero, we

mark them for allocation and cross out the respective rows in
which the zero appear.
2. In the process, we get unique zero in Column-2 and strike out the
Row-5
3. Now we have tie at Row-2 and Row-6, which can be broken
arbitrarily.
4. We allocate Row-2 to Column 5 (E) and Row-6 to Column-6 (F).

Now we can use Hungarian algorithm, which requires a balanced one to

apply. The following table is obtained after column reduction, since row
reduction will give the same table above.

50
Reducing the matrix row-wise

4.3. Restrictions in assignment Reducing the matrix column-wise

Restrictions in assignment

It is sometimes possible that a particular person is incapable of doing certain

work or a specific job cannot be performed on a particular machine. The
solution of the assignment problem should take into account these restrictions
so that the restricted (infeasible) assignment can be avoided. This can be
achieved by assigning a very high cost (say ∞ or M) to the cells where
assignments are prohibited, thereby restricting the entry of this pair of job-
machine or resource-activity into the final solution.

Example: Five jobs are to be assigned to five men. The cost (in Rs.) of
performing the jobs by each man is given in the matrix. The assignment has
restrictions that Job 4 cannot be performed by Man 1 and Job 3 cannot be
performed by Man 4 Find the optimal assignment of job and its cost involved.

Assignment Problem
Draw minimum number of lines to cover all zeros, see Table.

All Zeros Covered

Solution: Assign large value to the restricted combinations or introduce ‘M’,

see table.

Large Value Assignment to Restricted Combinations

Now, number of lines drawn = Order of matrix, hence optimality is reached.

Allocating Jobs to Men.

51
Job Allocation to Men Second, because our graph is undirected, we can generate only tours in which
b is visited before c. In addition, after visiting n-1 = 4 cities, a tour has no
choice but to visit the remaining unvisited city and return to the starting one.
The state-space tree tracing the algorithm ‘s application is given in the above
figure (b).

The comments we made at the end of the preceding section about the
strengths and weaknesses of backtracking are applicable to branch-and-bound
as well. To reiterate the main point: these state-space tree techniques enable us
to solve many large instances of difficult combinatorial problems.

As a rule, however, it is virtually impossible to predict which instances will be

solvable in a realistic amount of time and which will not.

Incorporation of additional information, such as symmetry of a game ‘s board,

Assignment Schedule and Cost can widen the range of solvable instances. Along this line, a branch-and-
bound algorithm can be sometimes accelerated by knowledge of the objective
function ‘s value of some nontrivial feasible solution.

The information might be obtainable—say, by exploiting specifics of the data

or even, for some problems, generated randomly—before we start developing
a state-space tree. Then we can use such a solution immediately as the best
one seen so far rather than waiting for the branch-and-bound processing to
lead us to the first feasible solution.

In contrast to backtracking, solving a problem by branch-and-bound ha both

the challenge and opportunity of choosing an order of node generation and
finding a good bounding function.
4.4. Travelling salesman problem
Though the best-first rule we used above is a sensible approach, its mayor
TRAVELING SALESMAN PROBLEM may not lead to a solution faster than other strategies.

We will be able to apply the dynamic programming technique to instances of

the traveling salesman problem, if we come up with a reasonable lower bound
on tour lengths.

One very simple lower bound can be obtained by finding the smallest element
in the intercity distance matrix D and multiplying it by the number of cities it.
But there is a less obvious and more informative lower bound, which does not
require a lot of work to compute.

It is not difficult to show that we can compute a lower bound on the length I
of any tour as follows.

For each city i, 1 ≤ i ≤ n, find the sum si of the distances from city i to the two
nearest cities; compute the sums of these n numbers; divide the result by 2,
and it all the distances are integers, round up the result to the nearest integer:

lb=|S/2|
b) State-space tree of the branch-and- bound algorithm applied to this
graph.

(The list of vertices in a node specifies a beginning part of the Hamiltonian

circuits represented by the node.)

For example, for the instance of the above figure, formula yields

We now apply the branch and bound algorithm, with the bounding function
given by formula, to find the shortest Hamiltonian circuit for the graph of the
above figure (a). To reduce the amount of potential work, we take advantage
of two observations.

First, without loss of generality, we can consider only tours that start at a.

Research Methodology
No ratings yet
Research Methodology
6 pages
Sampling Techniques Notes
No ratings yet
Sampling Techniques Notes
16 pages
Data Process Improvement: Data Collection Is A Term Used To Describe A Process of Preparing and
No ratings yet
Data Process Improvement: Data Collection Is A Term Used To Describe A Process of Preparing and
8 pages
Statistics - Chapter 2
No ratings yet
Statistics - Chapter 2
13 pages
CH 2 Collection of Data
No ratings yet
CH 2 Collection of Data
40 pages
Collection of Data
No ratings yet
Collection of Data
25 pages
Collection of Data
No ratings yet
Collection of Data
5 pages
Data Collection
No ratings yet
Data Collection
20 pages
I Am Sharing 'Theory Question of Business Statistics BBA 2nd Sem' With You
No ratings yet
I Am Sharing 'Theory Question of Business Statistics BBA 2nd Sem' With You
3 pages
Data Collection Methods Guide
100% (1)
Data Collection Methods Guide
8 pages
Statistics Chapter 2
No ratings yet
Statistics Chapter 2
8 pages
Unit - 3
No ratings yet
Unit - 3
18 pages
18bzo64e U3
No ratings yet
18bzo64e U3
24 pages
What Is Data Collection
No ratings yet
What Is Data Collection
23 pages
SLR Project - Primary and Secondary Data
100% (1)
SLR Project - Primary and Secondary Data
12 pages
Biostatistics Assignment
No ratings yet
Biostatistics Assignment
22 pages
Assignmnet - Data Collection Methods in Legal Research
No ratings yet
Assignmnet - Data Collection Methods in Legal Research
11 pages
Collection of Data
No ratings yet
Collection of Data
11 pages
Project Report 01.30am
No ratings yet
Project Report 01.30am
18 pages
Introduction of Data Collection
No ratings yet
Introduction of Data Collection
18 pages
MR UNIT-2 Imp (BBA-II)
No ratings yet
MR UNIT-2 Imp (BBA-II)
12 pages
RNDNV CA3
No ratings yet
RNDNV CA3
17 pages
Data Collection and Sources of Data
No ratings yet
Data Collection and Sources of Data
22 pages
Collection of Data
No ratings yet
Collection of Data
13 pages
Sources of Data in Statistics
No ratings yet
Sources of Data in Statistics
30 pages
Statch2collection
No ratings yet
Statch2collection
7 pages
Collection of Data
100% (2)
Collection of Data
18 pages
Unit-I Basic Statistics
No ratings yet
Unit-I Basic Statistics
14 pages
Data Collection Methods Guide
No ratings yet
Data Collection Methods Guide
13 pages
Unit 1
No ratings yet
Unit 1
6 pages
Collection of Data
No ratings yet
Collection of Data
27 pages
Unit Ii
No ratings yet
Unit Ii
47 pages
Data Collection Methods
No ratings yet
Data Collection Methods
31 pages
Primary & Secondary Data Collection
No ratings yet
Primary & Secondary Data Collection
18 pages
Economics Class 11 Unit 15 Collection of DATA
No ratings yet
Economics Class 11 Unit 15 Collection of DATA
0 pages
Data Collection Methods Guide
100% (3)
Data Collection Methods Guide
3 pages
Data Collection
No ratings yet
Data Collection
11 pages
Unit-5 Research Methodology
No ratings yet
Unit-5 Research Methodology
19 pages
Chapter 2 - Collection of Data (Statistics - Economics)
No ratings yet
Chapter 2 - Collection of Data (Statistics - Economics)
7 pages
1 - Methods of Data Collection-Primary and Secondary Sources
No ratings yet
1 - Methods of Data Collection-Primary and Secondary Sources
22 pages
Mmethods of Data Collection, Mba-Ii
No ratings yet
Mmethods of Data Collection, Mba-Ii
5 pages
ZOO-3204 - A-07. Methods of Data Collection
No ratings yet
ZOO-3204 - A-07. Methods of Data Collection
5 pages
Collection of Data - Class XI 2024-25 - 22102024 - 213505
No ratings yet
Collection of Data - Class XI 2024-25 - 22102024 - 213505
16 pages
Research Methods Short Note
No ratings yet
Research Methods Short Note
7 pages
Neeraja
No ratings yet
Neeraja
20 pages
4TH Sem Stat Bit 1
No ratings yet
4TH Sem Stat Bit 1
2 pages
Methods of Collecting Data
No ratings yet
Methods of Collecting Data
16 pages
Data Collection & Presentation Guide
No ratings yet
Data Collection & Presentation Guide
55 pages
Statistics
No ratings yet
Statistics
15 pages
PPT9
No ratings yet
PPT9
21 pages
CHAPTER TWO Stat I
No ratings yet
CHAPTER TWO Stat I
25 pages
Unit Iii - RM
No ratings yet
Unit Iii - RM
29 pages
Module 6 - Methods of Data Collection
No ratings yet
Module 6 - Methods of Data Collection
17 pages
Collection of Data
No ratings yet
Collection of Data
28 pages
CHAPTER FIVE: Sources and Methods of Data Collections 1.1 Data Collection
No ratings yet
CHAPTER FIVE: Sources and Methods of Data Collections 1.1 Data Collection
14 pages
Introduction of Collection of Data
No ratings yet
Introduction of Collection of Data
51 pages
6 Tools of Quantative Research Assignment 2
No ratings yet
6 Tools of Quantative Research Assignment 2
6 pages
Aptitude
No ratings yet
Aptitude
2 pages
Nodal Officer Nomination
No ratings yet
Nodal Officer Nomination
1 page
Matrices
No ratings yet
Matrices
34 pages
Iitm News Paper Ad - LMC
No ratings yet
Iitm News Paper Ad - LMC
1 page
El Hanafy 2022 ApJ 940 51
No ratings yet
El Hanafy 2022 ApJ 940 51
14 pages
Cuet Reel
No ratings yet
Cuet Reel
1 page
CLASS 8 CIVIS CH-1 - Extra Questions + NCERT Questions
No ratings yet
CLASS 8 CIVIS CH-1 - Extra Questions + NCERT Questions
11 pages
Federalism Test
No ratings yet
Federalism Test
1 page
Pandas Notes
No ratings yet
Pandas Notes
19 pages
Tissues Class 9 Test
No ratings yet
Tissues Class 9 Test
1 page
X Maths Ncert
No ratings yet
X Maths Ncert
236 pages
27 May
No ratings yet
27 May
1 page
Our Environment
No ratings yet
Our Environment
15 pages
Break Even
75% (4)
Break Even
44 pages
Info-Tech Pricing Proposal
No ratings yet
Info-Tech Pricing Proposal
4 pages
ResearchMethods - 2015-Research Methods For Economics and Related Studies PDF
No ratings yet
ResearchMethods - 2015-Research Methods For Economics and Related Studies PDF
186 pages
Hola Kola Case Write Up Hola Kola Case Write Up
100% (1)
Hola Kola Case Write Up Hola Kola Case Write Up
5 pages
Acronyms, Abbreviations and Notations Used in Economics
No ratings yet
Acronyms, Abbreviations and Notations Used in Economics
8 pages
Investment Analysis Basics
No ratings yet
Investment Analysis Basics
48 pages
Comprehensive Bid Evaluation Guide
No ratings yet
Comprehensive Bid Evaluation Guide
21 pages
Chap 1
No ratings yet
Chap 1
2 pages
Module 1-3 ACCTG 201
No ratings yet
Module 1-3 ACCTG 201
32 pages
Economy: Building Economics Assignment 2
No ratings yet
Economy: Building Economics Assignment 2
18 pages
Week 7
No ratings yet
Week 7
20 pages
CAT Quantitative Aptitude Practice
No ratings yet
CAT Quantitative Aptitude Practice
10 pages
Studio Sample Computation
No ratings yet
Studio Sample Computation
2 pages
15510870071367769573supply Chain Assessment 2
No ratings yet
15510870071367769573supply Chain Assessment 2
2 pages
Bcom 5 Sem Cost Accounting 1 21100106 Feb 2021
No ratings yet
Bcom 5 Sem Cost Accounting 1 21100106 Feb 2021
4 pages
Case Study - Coca Cola A Transnational Corporation
No ratings yet
Case Study - Coca Cola A Transnational Corporation
2 pages
Kertas Kerja Konsolidasi Inventory
No ratings yet
Kertas Kerja Konsolidasi Inventory
3 pages
GOLD
No ratings yet
GOLD
5 pages
Willem Buiter - Central Banks As Fiscal Players - The Drivers of Fiscal and Monetary Policy Space Book (2020)
100% (2)
Willem Buiter - Central Banks As Fiscal Players - The Drivers of Fiscal and Monetary Policy Space Book (2020)
241 pages
Keown Chapter 8
No ratings yet
Keown Chapter 8
33 pages
Mining Materials Management
No ratings yet
Mining Materials Management
20 pages
Manual English Version 2011: Global Management Challenge
No ratings yet
Manual English Version 2011: Global Management Challenge
63 pages
Vehicle Sales Agreement
88% (8)
Vehicle Sales Agreement
2 pages
Applied Economics
No ratings yet
Applied Economics
7 pages
Environmental & Resource Economics
No ratings yet
Environmental & Resource Economics
16 pages
09 Chapter 1
No ratings yet
09 Chapter 1
149 pages
Aviva UK: Insurance Excellence in UK GI
No ratings yet
Aviva UK: Insurance Excellence in UK GI
98 pages
Transfer Duty Botswana
0% (1)
Transfer Duty Botswana
8 pages
Benjamin Grahams Stock Valuation Equation
No ratings yet
Benjamin Grahams Stock Valuation Equation
8 pages
Merchandising Income Statements
100% (1)
Merchandising Income Statements
7 pages