Trip Generation Analysis

The document discusses multiple linear regression analysis for trip generation. It begins by presenting the basic multiple linear regression equation and how it estimates coefficients to minimize error. It then discusses assumptions, disadvantages, and interpreting regression results. Finally, it outlines the stepwise approach to developing and selecting the best regression equation, including developing correlation matrices, testing coefficients, and choosing the final model.

Uploaded by

Sourab Vokkalkar
Trip Generation Analysis

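Multiple Linear Regression

$$\hat{y} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$

The β's are estimated by minimising the Sum of Squared Errors (SSE) between the observed y values and the estimated ŷ values, i.e.,

$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Matrix notation

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad
\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}, \qquad
X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}$$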
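In this matrix notation the SSE-minimising coefficients have a well-known closed form, the normal-equation solution. It is a standard result rather than something stated on the slides, and it assumes that XᵀX is invertible (no exact linear dependence among the columns of X):

```latex
\mathrm{SSE}(\boldsymbol{\beta}) = (\mathbf{y} - X\boldsymbol{\beta})^{\top} (\mathbf{y} - X\boldsymbol{\beta}),
\qquad
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```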
Assumptions in Multiple linear regression analysis
• All the variables are independent of each other
• All the variables are normally distributed
• All the variables are continuous
• A linear relationship exists between the dependent variable and independent variables
• Influence of independent variable is additive, that is the inclusion of each variable in the equation contributes a distinct portion of the trip numbers
Disadvantages of regression analysis
• The equation derived is purely empirical in nature
• Correlation among independent (explanatory) variables may create estimation problems
• By using zonal averages, important socioeconomic variations within the zone may be obscured or may yield spurious results
• The assumptions of linearity and additive impacts on trip generation may be wrong
• “Best fit” equations may yield counterintuitive results

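Regression analysis
• Two independent variables: the general form of the regression equation for the two-variable case is

$$Y_e = a + b_1 X_1 + b_2 X_2$$

where

$$b_1 = \frac{\left(\sum x_2^2\right)\left(\sum x_1 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_2 y\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2}$$

$$b_2 = \frac{\left(\sum x_1^2\right)\left(\sum x_2 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_1 y\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2}$$

Here the lowercase x1, x2 and y denote deviations of X1, X2 and Y from their respective means.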
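Regression analysis

$$R^2 = \frac{\sum y_e^2}{\sum y^2}, \qquad S_e = \sqrt{\frac{\sum y_d^2}{n-3}}$$

$$S_{b_1} = \sqrt{\frac{S_e^2}{\sum x_1^2 \left(1 - r_{12}^2\right)}}, \qquad S_{b_2} = \sqrt{\frac{S_e^2}{\sum x_2^2 \left(1 - r_{12}^2\right)}}$$

• The coefficient of correlation between any two variables x and y (measured as deviations from their means) can be calculated as:

$$r_{xy} = \frac{\sum x y}{\sqrt{\left(\sum x^2\right)\left(\sum y^2\right)}}$$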
Trip attraction analysis
• Land-use characteristics influence trip attraction rate in urban areas.
• Common types of land-uses are:
– Residential
– Commercial
– Industrial
– Institutional
– Recreational
Causal variables for trip attraction
• Retail trade, service and office floor area
• Manufacturing and wholesale floor area
• Number of employment opportunities in retail trade, in service and office, and in manufacturing and wholesale
• School and college enrollment
• Number of special activity centers, like transport terminals, sports stadium, major cultural or recreational centers
Example
• Develop a trip-attraction equation using the data given in the table below. Do the necessary statistical checks to assess the validity of the equation. The table value of ‘t’ for this case at 5% level of significance is 1.77.

Zone No.   No. of Emp. Opportunities in zone   No. of daily work
           Manufacturing      Service          trips attracted
1          60                 30               190
2          40                 100              290
3          30                 20               150
4          20                 30               120
5          100                20               250
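The example can be worked through numerically as a minimal sketch, assuming the two-variable deviation-form formulas for b1, b2, R², Se, Sb1 and Sb2 given earlier. The intercept formula a = mean(Y) − b1·mean(X1) − b2·mean(X2) is the standard least-squares result and is not stated on the slides; the helper names are illustrative only.

```python
from math import sqrt

# Data from the example table
x1 = [60, 40, 30, 20, 100]      # manufacturing employment opportunities
x2 = [30, 100, 20, 30, 20]      # service employment opportunities
y = [190, 290, 150, 120, 250]   # daily work trips attracted
n = len(y)


def mean(values):
    return sum(values) / len(values)


def deviations(values):
    m = mean(values)
    return [v - m for v in values]


def sum_products(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


d1, d2, dy = deviations(x1), deviations(x2), deviations(y)
s11, s22, s12 = sum_products(d1, d1), sum_products(d2, d2), sum_products(d1, d2)
s1y, s2y, syy = sum_products(d1, dy), sum_products(d2, dy), sum_products(dy, dy)

# Regression coefficients (deviation form)
den = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / den
b2 = (s11 * s2y - s12 * s1y) / den
a = mean(y) - b1 * mean(x1) - b2 * mean(x2)   # standard result, assumed here

# Goodness of fit and standard errors
explained = b1 * s1y + b2 * s2y               # sum of squared explained deviations
r2 = explained / syy
se = sqrt((syy - explained) / (n - 3))
r12_sq = s12 ** 2 / (s11 * s22)
sb1 = sqrt(se ** 2 / (s11 * (1 - r12_sq)))
sb2 = sqrt(se ** 2 / (s22 * (1 - r12_sq)))
t1, t2 = b1 / sb1, b2 / sb2

t_table = 1.77                                # table value quoted in the example
print(f"Ye = {a:.2f} + {b1:.3f} X1 + {b2:.3f} X2, R2 = {r2:.3f}")
print(f"t1 = {t1:.2f}, t2 = {t2:.2f}, both significant: {t1 > t_table and t2 > t_table}")
```

With these data the sketch gives roughly Ye = 47.9 + 1.66 X1 + 1.73 X2 with R² of about 0.97, and both t-statistics exceed the quoted table value.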
The stepwise approach to regression analysis
• Step 1: Examine the nature of relationships between the dependent variable and each of the independent variables in order to detect nonlinearities
• If nonlinearities are detected, the relationship must be linearized by transforming the dependent variable or independent variable, or both
Transformation of variables
[Figure: Y plotted against X, and log Y plotted against log X]
Transformation of variables (contd.)
[Figure: Y plotted against X, two panels]
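As one illustration of linearising by transformation, the sketch below assumes a power-law relationship Y = a·X^b, which becomes a straight line after taking logs of both variables. The data and the `fit_line` helper are hypothetical and are not taken from the slides.

```python
from math import exp, log


def fit_line(xs, ys):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data generated from Y = 2 * X^1.5 (nonlinear in X)
X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [2.0 * x ** 1.5 for x in X]

# log Y = log a + b log X is linear, so ordinary regression applies
intercept, slope = fit_line([log(x) for x in X], [log(v) for v in Y])
a_hat, b_hat = exp(intercept), slope
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Other transformations (for example a log of only one variable) follow the same pattern; which one is appropriate depends on the shape seen in the scatter plot.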
The stepwise approach to regression analysis
• Step 2: Develop an inter-correlation matrix involving all the variables (including both the dependent and independent variables)
Inter Correlation matrix
      X1      X2      X3      X4      Y
X1    1       0.978   0.496   0.110   0.996
X2            1       0.304   0.068   0.958
X3                    1       0.074   0.552
X4                            1       0.124
Y                                     1

• X1: Total Employment
• X2: Manufacturing Employment
• X3: Retail and Service Employment
• X4: Other Employment
The stepwise approach to regression analysis
• Step 3: Examine the inter-correlation matrix in order to detect:
– Those independent variables which have a statistical association with the dependent variable, and
– Potential sources of collinearity between pairs of the independent variables
• Step 4: If any two independent variables are found to be highly correlated, eliminate one of the two highly correlated independent variables
• What is an acceptable value of R²?
Inference from Inter Correlation matrix
• Variables X1 and X2 have a high degree of correlation with the dependent variable
• Variables X1 and X2, however, are highly correlated with each other (0.978), i.e. nearly linearly dependent, and cannot be used together in the same regression equation
• The existence of a possible nonlinear relationship between Y and X4 can be examined, and a suitable transformation can be suggested depending on the relation
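The screening in Steps 3 and 4 can be sketched as below, using the correlation values from the matrix. The 0.9 cut-off for "highly correlated" is an assumption made for illustration; the slides do not state a threshold.

```python
# Upper triangle of the inter-correlation matrix from the slides
correlations = {
    ("X1", "X2"): 0.978, ("X1", "X3"): 0.496, ("X1", "X4"): 0.110, ("X1", "Y"): 0.996,
    ("X2", "X3"): 0.304, ("X2", "X4"): 0.068, ("X2", "Y"): 0.958,
    ("X3", "X4"): 0.074, ("X3", "Y"): 0.552,
    ("X4", "Y"): 0.124,
}

THRESHOLD = 0.9  # hypothetical cut-off, not from the slides

# Independent variables strongly associated with the dependent variable Y
strong_with_y = [pair[0] for pair, r in correlations.items()
                 if pair[1] == "Y" and abs(r) >= THRESHOLD]

# Pairs of independent variables that are potential sources of collinearity
collinear_pairs = [pair for pair, r in correlations.items()
                   if "Y" not in pair and abs(r) >= THRESHOLD]

print("Strongly associated with Y:", strong_with_y)
print("Collinear pairs:", collinear_pairs)
```

Under this cut-off the sketch flags X1 and X2 as strongly associated with Y and (X1, X2) as a collinear pair, which matches the inference drawn from the matrix.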
The stepwise approach to regression analysis
• Step 5: Do regression analysis with the chosen set of independent variables
• Estimate the parameters of each of the potential regression equations
• Regression model should be logically and statistically valid
The stepwise approach to regression analysis
• Step 6: Conduct the relevant tests to assess the goodness of the model based on logic and statistics. Answers to the following questions will enable this.
– What is the magnitude of R²?
– Do the partial regression coefficients have the correct sign, and are their magnitudes reasonable?
– Are the partial regression coefficients statistically significant?
– Is the magnitude of the constant (intercept) reasonable?
Regression equations
Equation   Parameter Estimate                                              R²      df    t0.05
No.        Constant   X1          X2           X3            X4
1          61.4       0.93 (42)   -            -             -             0.992   14    1.76
2          507.7      -           0.98 (14)    -             -             0.921   14    1.76
3          25.8       -           0.89 (51)    1.29 (17)     -             0.996   13    1.77
4          -69.9      -           1.26 (3.7)   -0.37 (1.1)   0.02 (0.06)   0.998   12    1.78

• Sign, parameter significance, constant to be verified

Note: Value in parentheses is t-statistic for the coefficient estimate
Selection of appropriate equation
Eq. No.   Logical Sign    Parameter Significance    Constant
1         Logical         Significant               Not a problem
2         Logical         Significant               Large
3         Logical         Significant               Not a problem
4         Illogical, X3   X3, X4 not significant    Not a big problem

Eq. No.   R²      Adjusted R²   F-Statistic   F0.05   Remarks
1         0.992   0.991         1736          4.60    Accept
2         0.921   0.915         163           4.60    Reject
3         0.996   0.995         1619          3.81    Accept
4         0.998   0.998         1996          3.50    Reject
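The adjusted R² and F-statistic columns can be reproduced from R², the number of independent variables k and the residual degrees of freedom, as a rough check. This sketch assumes df = n − k − 1 (an intercept is fitted), which is consistent with the df values in the table; the function names are illustrative.

```python
def adjusted_r2(r2, k, df):
    """Adjusted R2 for k regressors and df = n - k - 1 residual degrees of freedom."""
    n = df + k + 1
    return 1 - (1 - r2) * (n - 1) / df


def f_statistic(r2, k, df):
    """Overall F-statistic of the regression."""
    return (r2 / k) / ((1 - r2) / df)


# (R2, number of independent variables, residual df) for each equation in the table
equations = {
    1: (0.992, 1, 14),
    2: (0.921, 1, 14),
    3: (0.996, 2, 13),
    4: (0.998, 3, 12),
}

results = {eq: (adjusted_r2(*v), f_statistic(*v)) for eq, v in equations.items()}
for eq, (adj, f) in results.items():
    print(f"Eq {eq}: adjusted R2 = {adj:.3f}, F = {f:.1f}")
```

Small differences from the tabulated figures can arise because the R² values in the table are already rounded to three decimals.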
The stepwise approach to regression analysis
• Step 7: Choose the best of the regression equations based on the test results
• The selection of the final model among the accepted regression equations depends on:
– how good the goodness-of-fit statistics are,
– how easily one can forecast the independent variables, and
– how simple the equation is
Usage of variables in regression equation
• Quantitative variables
– Household size, number of cars, etc.
• Categorical variables
– Gender (1 = female, 2 = male), HH income (1 = <10,000; 2 = 10,000-20,000; ……; 6 = >1,00,000)
• Dummy variables
– If a variable has n categories, then (n-1) dummy variables are created
– For example, car ownership has three categories (0, 1, 2+), so two dummy variables are created
– C1: 1 if the household owns one car, otherwise zero
– C2: 1 if the household owns 2 or more, otherwise zero
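A minimal sketch of the car-ownership coding described above; the function name `car_dummies` is illustrative. The zero-car household is the reference category, so it is represented by both dummies being zero.

```python
def car_dummies(cars_owned):
    """Return (C1, C2) for a household owning the given number of cars."""
    c1 = 1 if cars_owned == 1 else 0   # exactly one car
    c2 = 1 if cars_owned >= 2 else 0   # two or more cars
    return c1, c2


for cars in (0, 1, 2, 3):
    print(cars, car_dummies(cars))
```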
Some questions……..
• How to check the logical correctness of a regression equation?
– By checking the signs of the coefficients and the value of the intercept constant
• What aspects are considered to check the statistical validity of a regression equation?
– A reasonable R² (what is reasonable depends on the accuracy required)
– Standard error of estimate less than the standard deviation
– t-statistic (if the value of t is less than the table t value, the coefficient is not significant)
Some questions……..
• How to choose the set of independent variables for regression analysis?
– By using the inter-correlation matrix (if the correlation is nearly zero, there is no point in choosing that variable)
• How to quantify the effectiveness of independent variables in regression analysis?
– Based on the t-value (the larger the value of t, the more significant the variable)
Balancing trip productions and attractions
• Total productions in a study area should be equal to total attractions
• When the productions and attractions are estimated from calibrated trip generation equations, total productions may not match with the total attractions
• Estimates of productions from trip production equations are considered as more correct than estimates of attractions from trip attraction equations
Balancing trip productions and attractions
• The reasons are
– Explanatory variables in trip production equations are accurately estimated, as these are collected through the census every decade
– Household based regression equations of trip productions are superior to zonal based regression equations of trip attraction
• In order to match attractions with productions, all attractions are multiplied by a factor, f = T/∑Aj
• Where T = total trips = ∑Pi; Aj = trip attractions of zone j; Pi = trip productions of zone i
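The balancing step can be sketched as follows. The zone totals are hypothetical numbers chosen only to show that scaling every attraction by f = T/∑Aj makes total attractions equal total productions.

```python
# Hypothetical zonal estimates from trip generation equations
productions = [1200.0, 800.0, 500.0]   # P_i, treated as the control total
attractions = [900.0, 1000.0, 700.0]   # A_j, to be adjusted

T = sum(productions)                   # total trips
f = T / sum(attractions)               # balancing factor
balanced_attractions = [f * a_j for a_j in attractions]

print(f"f = {f:.4f}")
print("Balanced attractions:", [round(a_j, 1) for a_j in balanced_attractions])
```

The relative distribution of attractions across zones is unchanged; only their total is scaled.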
Growth factor modeling
• A technique which may be applied to predict the future number of trips.

Ti = Fi * ti
• Where Ti and ti are the future and current trips in zone i, and Fi is the growth factor
• Fi is related to variables such as population, household income and car ownership

Fi = f(Ph,Ih,Ch) / f(Pb,Ib,Cb)

• Where Ph, Ih, Ch are population, income and car ownership in the horizon year respectively, and Pb, Ib, Cb are the corresponding values in the base year
Growth factor modeling

• Given that a zone has 275 households with a car and 275 households without a car, and that the average trip generation rates for the two groups are 5.0 and 2.5 trips per day respectively, find the growth factor and the future trips from that zone, assuming that in the future all households will have a car and that population and income remain constant
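One possible reading of this exercise is sketched below. It assumes that, with population and income constant, the growth factor reduces to the ratio of future to current car-ownership shares; the slides do not give the worked solution, so this interpretation is an assumption. For comparison the sketch also computes the future trips obtained by applying the car-owning trip rate to all households.

```python
hh_with_car, hh_without_car = 275, 275
rate_with_car, rate_without_car = 5.0, 2.5      # trips per household per day

# Current trips t_i
current_trips = hh_with_car * rate_with_car + hh_without_car * rate_without_car

# Growth factor taken as the ratio of car-ownership shares (assumed interpretation)
share_now = hh_with_car / (hh_with_car + hh_without_car)
share_future = 1.0                               # all households own a car
growth_factor = share_future / share_now
future_trips_growth_factor = growth_factor * current_trips

# Comparison: apply the car-owning trip rate directly to every household
future_trips_trip_rates = (hh_with_car + hh_without_car) * rate_with_car

print(f"Current trips: {current_trips}")
print(f"Growth factor: {growth_factor}, future trips: {future_trips_growth_factor}")
print(f"Future trips from trip rates: {future_trips_trip_rates}")
```

Under these assumptions the growth factor estimate is noticeably higher than the trip-rate estimate, which illustrates how crude a single growth factor can be.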
Category analysis
• Definition: It is a technique for estimating the trip production characteristics of households, which have been sorted into number of separate categories according to a set of properties that characterize the household
• Households are grouped based on their socioeconomic characteristics into categories
Category analysis
• Most traffic analysis zones tend to contain a mixture of social and economic classes of people
• The use of regression equations, based on aggregate measures of zonal characteristics, tends to submerge important characteristics of travel demand
• Transport planners have proposed that this difficulty may be overcome by the use of households, rather than traffic zones, as the basic unit of trip making
• This modeling technique, which is based on the household, is known as cross classification analysis
Category analysis
• The variance of trip making within a category is assumed to be negligible
• Household trip rates for each category are determined and are assumed to remain constant
• For example, if households are grouped based on car ownership (three levels: 0, 1, 2+) and household size (six levels: 1, 2, 3, 4, 5, 6+), this results in 3 × 6 = 18 categories.
• Suggested minimum number of observations per cell: 50. This number is chosen so that the mean trip rate can be estimated reliably
Category analysis for work trips
Cars    Measure      Household Size
                     1       2       3       4       5       6+      Total
0       Trips        255     1231    1149    1111    827     1081    5654
        N HHs        828     1341    652     549     389     443     4202
        Trip Rate    0.308   0.92    1.76    2.02    2.13    2.44    1.35

1       Trips        301     4844    5781    7466    4956    4879    28227
        N HHs        344     2793    2472    3092    2046    1889    12636
        Trip Rate    0.875   1.73    2.34    2.41    2.42    2.58    2.23

2+      Trips        8       644     2220    3231    2424    3002    11529
        N HHs        5       294     717     1022    726     870     3634
        Trip Rate    1.6     2.19    3.10    3.16    3.34    3.45    3.17

Total   Trips        564     6719    9150    11808   8207    8962    45410
        N HHs        1177    4428    3841    4663    3161    3202    20472
        Trip Rate    0.48    1.52    2.38    2.53    2.60    2.80    2.218
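The trip rates in each cell are simply trips divided by households, and the resulting rates can then be applied to the household mix of a zone. The sketch below uses the trips and household counts from the table; the zone household counts at the end are hypothetical and serve only to show how the rates would be applied.

```python
sizes = ["1", "2", "3", "4", "5", "6+"]

# Trips and number of households by car ownership and household size (from the table)
trips = {
    "0":  [255, 1231, 1149, 1111, 827, 1081],
    "1":  [301, 4844, 5781, 7466, 4956, 4879],
    "2+": [8, 644, 2220, 3231, 2424, 3002],
}
households = {
    "0":  [828, 1341, 652, 549, 389, 443],
    "1":  [344, 2793, 2472, 3092, 2046, 1889],
    "2+": [5, 294, 717, 1022, 726, 870],
}

# Trip rate per household in each category
rates = {cars: [t / h for t, h in zip(trips[cars], households[cars])] for cars in trips}

# Hypothetical zone: number of households in a few (cars, size) categories
zone_households = {("0", "2"): 80, ("1", "3"): 120, ("2+", "4"): 40}
zone_trips = sum(count * rates[cars][sizes.index(size)]
                 for (cars, size), count in zone_households.items())

print({cars: [round(r, 2) for r in row] for cars, row in rates.items()})
print(f"Estimated work trips produced by the hypothetical zone: {zone_trips:.0f}")
```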
Advantages - Category analysis
• The whole concept of household trip making is simplified in
this technique.
• No mathematical relationship is derived between trip
making and household characteristics
• Since data from the census can be used directly, it saves
considerable effort, time, and money spent on home
interview surveys
• The computations are relatively simpler
• The technique simulates the human behaviour more
realistically than the zonal aggregation process
Advantages - Category analysis
• Ease of understanding by decision makers and the public
• Efficient use of data: If no O-D data is available, a small
stratified sample data is enough
• Validity: The process is valid in forecasting as well as in the
base year accuracy check
• Flexibility: Application at different study levels: zonal,
regional, corridor and so on
• Easy transferability of analysis between cities or parts of the
study areas of the same size and character
• Wide use of data: Census data can be used extensively,
particularly socioeconomic data
Disadvantages – Category analysis
• It is difficult to test the statistical significance of the various explanatory variables
• The technique normally makes use of studies made elsewhere in the past, with broad corrections
• In the analysis, it is assumed that income and car ownership increase in future.
• New variables cannot be introduced at a future date
• Large samples are needed to assign trip rates to any one category
Zonal trip productions and attractions from category analysis
Things to remember
• A high R² (coefficient of determination) by itself means little if the t-test is marginal or poor
• Just having a large number of independent variables does not mean very much. Choose only the independent variables that have the highest correlation with the dependent variable and low correlation among themselves.
• Check whether the coefficients are logical. Trip generation is never “negative” in reality, no matter what value the independent variable has.
Develop matrices connecting income to automobiles available (use
the table below), and also draw a graph connecting trips per
household to income. How many trips will a household with an
income of 10,000 peso per month owning one auto make per day?
• The average number of trips the household generates in
each cell is calculated.
• For example, the average trip rate for households with
two or more autos and an income between 12,000 and
15,000 pesos/month is 11.5, because households 16 and
17 together make a total of 33 trips. These average
rates are shown.
Trips per household based on household income and car ownership

• A household with Php 10,000 income and one car per household will make 7.5 trips per day
Trip rates method

• Example:
– A number of suburban zones have a total of 1000
dwelling units (DU). The average income per DU is
42,000 dollars. Using the curves a, b, and c
provided, estimate the number of trips produced by
the zones
Use of Accessibility Measure in Trip Generation Models
• As no transportation system variable is used in trip generation analysis, changes in the transportation system (such as introduction of new high speed road link, metro rail link, etc.) have no effect on productions and attractions.
• To overcome this disadvantage, modellers have used accessibility measure as one of the explanatory variables.
• Accessibility measure is directly related to the opportunities available and inversely related to the deterrence in reaching these opportunities.
Use of Accessibility Measure in Trip Generation Models

$$AM_i = \sum_j f(E_j, C_{ij})$$

Where AM_i = Accessibility Measure; E_j = Attraction (opportunities, such as employment) of zone j; C_ij = Generalised cost of travel between zone i and zone j = α Time + β Cost

A typical accessibility measure that is generally used is:

$$AM_i = \sum_j E_j \, e^{-\beta C_{ij}}$$

Where β = calibration parameter
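The exponential accessibility measure can be evaluated as in the sketch below. The employment figures, generalised costs and the value of the calibration parameter are all hypothetical; in practice β would come from calibration.

```python
from math import exp


def accessibility(employment, cost_row, beta):
    """AM_i = sum over j of E_j * exp(-beta * C_ij) for one origin zone i."""
    return sum(e_j * exp(-beta * c_ij) for e_j, c_ij in zip(employment, cost_row))


employment = [1000.0, 500.0, 2000.0]    # hypothetical E_j for three zones
cost = [                                # hypothetical generalised costs C_ij
    [5.0, 10.0, 20.0],
    [10.0, 5.0, 15.0],
    [20.0, 15.0, 5.0],
]
beta = 0.1                              # hypothetical calibration parameter

am = [accessibility(employment, row, beta) for row in cost]
for zone, value in enumerate(am, start=1):
    print(f"Zone {zone}: AM = {value:.1f}")
```

A zone scores higher when large employment centres are cheap to reach, so a new link that lowers C_ij raises AM_i and can feed through to the trip generation estimate.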
Urban Transportation Planning Process
[Flowchart; boxes listed row by row]
• Formulation of Goals and Objectives
• Collection of data from traffic survey
• Inventory of existing facilities
• Development of alternative highway and PT networks
• Travel forecast
– Trip generation
– Trip distribution
– Future travel demand
• Land use forecast
– Population
– Economic activity
– Land use
• Assignment of movements to alternative networks
• Evaluation of alternative networks: costs, benefits, impacts, practicability
• Selection and Implementation
Procedure for Generating Base Year O-D Matrices
[Flowchart; steps listed from top to bottom, side boxes in parentheses]
• Eliminate bias in HH data and compute expansion factors
• Expanded partial O-D matrices
• O-D matrices with all trips (side boxes: outer cordon O-D; O-D surveys at terminals; workplace based surveys)
• Comparison of trips from O-D matrices with screen line counts; adjustment of matrices
• Load matrices on to the network and compare the assigned and observed link flows. Validate matrices. Select appropriate assignment technique (side boxes: network data; screen line data; cordon line data; bus route network)
• Validated O-D matrices for base year
FORECASTING
[Flowchart; steps listed from top to bottom, side boxes in parentheses]
• Projection of Planning Variables using Land-use / demographic models for the future year
• Apply trip-end equations and obtain future year trip-ends of internal trips
• Apply calibrated gravity model and obtain O-D matrix for internal trips (side box: previous cost/time skims for initial run)
• Apply mode choice model and obtain PT, car and two-wheeler O-D matrices of passenger internal trips
• Obtain truck matrix and mode-wise external O-D matrices by Furness method using growth factors
• Matrix of daily PT (bus + rail + taxi + walk) passenger trips
• AM peak and PM peak matrices of car, two-wheeler and truck trips in PCU (side boxes: regional peak hour to daily flow ratios; passenger - PCU conversion factors)
• Assignment of PT passenger trips on to the public transport network
• Assignment of peak-hour PCU trips on road network, taking peak-hour PT & truck PCU flows as preloads (side box: road network data and PT network data for the scenario under consideration)
• Link costs stable? (No: loop back; Yes: continue)
• Final outputs
– Final Link flows
– PT Loadings (Bus, MRTS, Rail, Taxi, walk)
– MRTS Boardings and Alightings
– Final PT and Highway Cost/Time skims
