
Demand Estimation by Using Regression Analysis

Regression analysis is a statistical method used to establish a relationship between one variable (the dependent variable) and the factors that affect it (the independent variables).

This relationship can be expressed in functional form:

Q = a₀ + a₁ A + a₂ B + a₃ C

Demand estimation for a product or service using regression analysis is important in business, especially to corporate executives and managers, because it enables them to make reasonable near-term forecasts for their goods and services. Managers can narrow down the factors that most influence their sales and then formulate appropriate strategies or policies to achieve their management objectives.

The actual process of regression analysis can be very complex, but it can be summarized in four important steps:

1. Model specification: set the objective and identify the important variables that influence the dependent variable.
2. Data collection for all the variables specified.
3. Choice of a functional form, e.g. linear or non-linear.
4. Estimation and interpretation of the results.

1. Model Specification

If we want to study the factors affecting the demand for automobiles (Qx) in the country, we must identify the most important variables that are believed to affect that demand, e.g.

a) Price of the automobile (Px)

b) Per capita income (Yc)

c) Size of the working population (L)

d) Rate of interest (I), etc.

Qx = f(Px, Yc, L, I, …)

2. Data Collection on the Variables

There are two types of data:

a) Time series data

Data are collected for each variable over time (yearly, quarterly, monthly, daily, etc.).

b) Cross-sectional data

Data are collected for the same time period but from different sections or geographical areas of society.

The type of data used depends on what is available. Data may come from two sources:

a) Primary data: data collected from the field through market surveys, sampling, etc.

b) Secondary data: data published by a relevant authority, such as the Statistical Department, Economic Reports, etc.

3. Specifying the Form of the Equation

i) Linear model

The simplest model to deal with, and often also the most realistic, is the linear model.

e.g. Qx = a₀ + a₁Px + a₂Yc + a₃L + a₄I + … + e

a₀, a₁, …, a₄ are parameters (coefficients) to be estimated

e = disturbance term or error term

ii) Non-linear model

Sometimes a non-linear form may fit the data better than a linear equation.

Qx = a₀ · Px^α₁ · Yc^α₂ · L^α₃ · I^α₄ (power function)
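A power function can still be estimated with linear regression, because taking logarithms of both sides makes it linear in the parameters: ln Qx = ln a₀ + α₁ ln Px + … The sketch below illustrates this for a single price variable. The parameter values and price points are hypothetical, chosen only to show that the log transformation recovers them; they are not taken from the manual.

```python
import math

# Hypothetical parameters and prices, for illustration only.
a0_true, alpha_true = 100.0, -1.5
prices = [1.0, 2.0, 4.0, 8.0]
quantities = [a0_true * p ** alpha_true for p in prices]

# Q = a0 * P^alpha  becomes  ln Q = ln a0 + alpha * ln P after taking logs.
log_p = [math.log(p) for p in prices]
log_q = [math.log(q) for q in quantities]

n = len(prices)
mean_lp = sum(log_p) / n
mean_lq = sum(log_q) / n

# Ordinary least squares on the logged data.
alpha_hat = (
    sum((x - mean_lp) * (y - mean_lq) for x, y in zip(log_p, log_q))
    / sum((x - mean_lp) ** 2 for x in log_p)
)
a0_hat = math.exp(mean_lq - alpha_hat * mean_lp)

print(f"alpha = {alpha_hat:.3f}, a0 = {a0_hat:.3f}")
```

In the logged form the exponent α₁ is the slope, so it can be read directly as the price elasticity of demand.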

4. Testing the (Econometric) Results

To evaluate the regression results, several statistics are examined:

a) The sign of each estimated coefficient, which must be checked to see if it conforms to what is expected on theoretical grounds

b) Coefficient of determination, R²

c) t-tests (on each coefficient)

d) Durbin-Watson statistic

e) F-statistic, etc.

Note: The statistical procedure for solving multiple regression problems can be very complicated. Fortunately, many software packages are available for the purpose, e.g. TSP (Time Series Processor) or SPSS.

REGRESSION ANALYSIS

Regression analysis describes the way in which one variable is related to another. It derives an equation that can be used to estimate the unknown values of one variable on the basis of known values of another variable.

(a) Simple Regression Analysis

Y = a + bX, where Y is sales volume and X is advertising expenditure

Example 1

(Taken from ECO556 Manual, Table 4.1, page 136)

Year	Sales (Y) (million dollars)	Advertising Expenditure (X) (million dollars)
1997	44	10
1998	58	13
1999	48	11
2000	46	12
2001	42	11
2002	60	15
2003	52	12
2004	54	13
2005	56	14
2006	40	9

The result from the computer printout:

LS // Dependent variable is SAL
SMPL range: 1986 - 1995
Number of observations: 10

Variable	Coefficient	Std. Error	T-Stat	2-Tail Sig.
C	7.6000000	6.332345	1.2001912	0.264
ADV	3.5333333	0.5222813	6.751919	0.000

R-squared	0.851212	Mean of dependent var	50.00000
Adjusted R-squared	0.832614	S.D. of dependent var	6.992059
S.E. of regression	2.860653	Sum of squared resid	65.46667
Durbin-Watson stat	1.224915	F-statistic	45.76782
Log likelihood	-23.58417

Ŷ = â + b̂X

=> Ŷ = 7.6 + 3.53X
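The intercept and slope in the printout can be reproduced by hand from Table 4.1 with the usual least-squares formulas, b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and a = Ȳ − bX̄. The following is a minimal sketch in plain Python; the variable names are illustrative and are not part of the TSP or SPSS output.

```python
# Table 4.1 data (million dollars).
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]        # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]   # X

n = len(sales)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products in deviations from the means.
s_xx = sum((x - mean_x) ** 2 for x in advertising)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))

b_hat = s_xy / s_xx               # slope: marginal effect of advertising
a_hat = mean_y - b_hat * mean_x   # intercept

print(f"Y-hat = {a_hat:.2f} + {b_hat:.2f} X")  # prints: Y-hat = 7.60 + 3.53 X
```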

(b) Multiple Regression Analysis

Y = a₁ + b₁X₁ + b₂X₂

where Y is sales volume and a₁ is the intercept;

X₁ is advertising expenditure and b₁ = ΔY/ΔX₁ is the marginal effect of advertising on sales;

X₂ is the price of the product and b₂ = ΔY/ΔX₂ is the marginal effect of price on sales.

Example 2

(Taken from ECO556 Manual, Table 4.3, page 141)

Year	Sales (Y) (million dollars)	Advertising Expenditure (X1) (million dollars)	Price (X2) (million dollars)
1997	44	10	1
1998	58	13	1.2
1999	48	11	2
2000	46	12	1.8
2001	42	11	2.1
2002	60	15	0.8
2003	52	12	1.4
2004	54	13	2.0
2005	56	14	1.5
2006	40	9	1.0

The result from the computer printout:

LS // Dependent variable is SAL
SMPL range: 1986 - 1995
Number of observations: 10

Variable	Coefficient	Std. Error	T-Stat	2-Tail Sig.
C	11.60403	6.9633945	1.6665152	0.140
ADV	3.4936051	0.5078770	6.8788413	0.000
P	-2.3836921	1.9495316	-1.2226999	0.261

R-squared	0.877397	Mean of dependent var	50.00000
Adjusted R-squared	0.842367	S.D. of dependent var	6.992059
S.E. of regression	2.776058	Sum of squared resid	53.94549
Durbin-Watson stat	1.41	F-statistic	25.04734

Ŷ = â₁ + b̂₁X₁ + b̂₂X₂

=> Ŷ = 11.60 + 3.49X₁ - 2.38X₂
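With two independent variables the least-squares coefficients can still be worked out from sums of squares and cross-products of deviations from the means. The sketch below applies those closed-form formulas to Table 4.3; it is one way to reproduce the estimates, not necessarily how the software computes them, and the helper names `mean` and `cross` are illustrative.

```python
# Table 4.3 data.
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]               # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]          # X1
price = [1, 1.2, 2, 1.8, 2.1, 0.8, 1.4, 2.0, 1.5, 1.0]         # X2


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


s11 = cross(advertising, advertising)
s22 = cross(price, price)
s12 = cross(advertising, price)
s1y = cross(advertising, sales)
s2y = cross(price, sales)

# Solve the two normal equations for the slopes.
det = s11 * s22 - s12 ** 2
b1_hat = (s1y * s22 - s12 * s2y) / det
b2_hat = (s2y * s11 - s12 * s1y) / det
a1_hat = mean(sales) - b1_hat * mean(advertising) - b2_hat * mean(price)

print(f"a1 = {a1_hat:.4f}, b1 = {b1_hat:.4f}, b2 = {b2_hat:.4f}")
```

The slopes agree with the printout (3.4936 and -2.3837), and the intercept comes out near 11.6046, which rounds to the same 11.60 used in the estimated equation.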

Evaluation of Results (Computer Printouts)

The following important statistical results should be interpreted:

The sign of each estimated coefficient
Coefficient of determination (R²)
Standard error of estimate (Se)
The t-statistics (t-stats)
The F-statistics (F-stats)

Interpretation:

a. The sign of each estimated coefficient must be checked to see if it conforms to what is expected on theoretical grounds.

From Example 1: Ŷ = 7.6 + 3.53X

The estimated coefficient on advertising is positive (+3.53), so it conforms to economic theory. Each additional $1 spent on advertising (X) raises estimated sales (Y) by about $3.53 (both series are measured in million dollars).

b. Coefficient of determination (R²)

The value of R² ranges from 0 to 1.

R² = 0 shows that none of the variation in the dependent variable is explained by the independent variables.

R² = 1 shows that all of the variation in the dependent variable is explained by the variation in the independent variables.

R² = 0.85 (Example 1) shows that 85% of the variation in the dependent variable, sales, is explained by the variation in the independent variable, advertising expenditure. The other 15% cannot be explained by the regression. This may be due to the omission of some important independent variables.
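R² can be computed as one minus the ratio of the unexplained variation (sum of squared residuals) to the total variation of Y around its mean. The sketch below does this for the Table 4.1 data; the variable names are illustrative only.

```python
# Table 4.1 data (million dollars).
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]        # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]   # X

n = len(sales)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept.
b_hat = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
    / sum((x - mean_x) ** 2 for x in advertising)
)
a_hat = mean_y - b_hat * mean_x

fitted = [a_hat + b_hat * x for x in advertising]
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))   # unexplained variation
sst = sum((y - mean_y) ** 2 for y in sales)              # total variation

r_squared = 1 - sse / sst
print(f"R-squared = {r_squared:.6f}")
```

The result matches the R-squared and "Sum of squared resid" lines of the Example 1 printout.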

c. Standard error of estimate (Se)

It is a measure of the dispersion of the data points around the line of best fit (the regression line). Actual points do not lie exactly on the regression line but are dispersed above and below it. Thus, a value predicted by the regression line is subject to error, and Se measures the probable size of that error.

For example, in Table 4.1, when advertising expenditure is $9 million, actual sales are $40 million. The regression equation predicts sales of 7.6 + 3.53(9) = $39.37 million, so the predicted value is in error by $0.63 million.

The standard error of estimate can be calculated with the following formula:

Se = √[ Σ (Yₜ − Ŷₜ)² / (n − k) ], with the sum taken over t = 1, …, n

Se is useful for estimating the range within which the dependent variable will lie at a specified probability. At 95% probability, the dependent variable will lie in the interval:

Ŷ ± t(n − k) × Se

where Ŷ is the predicted value of the dependent variable based on the regression, and t(n − k) is the critical value from Student's t distribution with n − k degrees of freedom (df); n is the number of observations and k is the number of coefficients estimated.

Example:

Se = 2.8. The 95% confidence interval for sales when advertising expenditure (X) = 9 and Ŷ = 39.37 is Ŷ ± t(n − k) × Se

=> 39.37 ± (2.306)(2.8)

=> 39.37 ± 6.457

Thus, at the 95% confidence level, when advertising expenditure is $9 million, sales range from $32.913 million to $45.827 million.
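The same calculation can be sketched in Python from the Table 4.1 data. The critical value 2.306 is taken from the notes rather than computed, and the interval follows the simple Ŷ ± t × Se rule described here. With unrounded inputs the band comes out at roughly 32.80 to 46.00, slightly wider than the figures obtained from the rounded values 39.37 and 2.8.

```python
import math

# Table 4.1 data (million dollars).
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]        # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]   # X

n, k = len(sales), 2   # k = number of coefficients estimated (a and b)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

b_hat = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
    / sum((x - mean_x) ** 2 for x in advertising)
)
a_hat = mean_y - b_hat * mean_x

# Standard error of estimate: root of SSE divided by degrees of freedom.
sse = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(advertising, sales))
se = math.sqrt(sse / (n - k))

t_crit = 2.306               # critical t for 8 df at 95%, as quoted in the notes
y_pred = a_hat + b_hat * 9   # predicted sales when advertising is 9
lower = y_pred - t_crit * se
upper = y_pred + t_crit * se

print(f"Se = {se:.4f}, interval = {lower:.2f} to {upper:.2f}")
```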

d. t-statistics

The t-statistic is used in the t-test to determine whether there is a significant relationship between the dependent variable and each independent variable. To do this test, we need the standard error of the coefficient (Sb) to calculate the t value. We then compare the calculated t value with the critical t value from the Student's t distribution table.

The t value is calculated by dividing the value of the coefficient (b) by Sb:

Calculated t = b / Sb

i.e. Calculated t = 3.53 / 0.52 = 6.79

To find the critical value from the Student's t distribution table:

n − k = 10 − 2 = 8 df; at the 95% confidence level (two-tailed), critical t = 2.306

Since the computed t (6.79) > critical t (2.306), advertising expenditure is statistically significant in explaining the variation in sales at the 95% confidence level.

Note: if there is more than one independent variable, the significance test has to be carried out for each of them.
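For the simple regression in Example 1, Sb can be obtained from Se as Sb = Se / √Σ(X − X̄)². The sketch below, with illustrative variable names, reproduces the standard error reported for ADV and the t value. With unrounded inputs the ratio comes out near 6.77, close to the 6.79 obtained from the rounded figures 3.53 and 0.52.

```python
import math

# Table 4.1 data (million dollars).
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]        # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]   # X

n, k = len(sales), 2
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

s_xx = sum((x - mean_x) ** 2 for x in advertising)
b_hat = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales)) / s_xx
a_hat = mean_y - b_hat * mean_x

sse = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(advertising, sales))
se = math.sqrt(sse / (n - k))

s_b = se / math.sqrt(s_xx)   # standard error of the slope coefficient
t_calc = b_hat / s_b

t_crit = 2.306               # critical t for 8 df, as quoted in the notes
significant = abs(t_calc) > t_crit
print(f"Sb = {s_b:.4f}, t = {t_calc:.2f}, significant: {significant}")
```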

e. Durbin-Watson statistic

It indicates the presence or absence of autocorrelation, a problem that can arise in regression analysis with time series data.

There are three situations in which an autocorrelation or multicollinearity problem can arise:

· When independent variables are interrelated or duplicated

· When independent variables have been misspecified

· When important independent variables are missing
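The Durbin-Watson statistic itself is computed from the regression residuals as DW = Σ(eₜ − eₜ₋₁)² / Σeₜ², where eₜ is the residual for period t. The sketch below applies this standard formula to the Example 1 residuals; the variable names are illustrative.

```python
# Table 4.1 data in time order, 1997 to 2006 (million dollars).
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]        # Y
advertising = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]   # X

n = len(sales)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

b_hat = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
    / sum((x - mean_x) ** 2 for x in advertising)
)
a_hat = mean_y - b_hat * mean_x

# Residuals must stay in time order for the statistic to be meaningful.
residuals = [y - (a_hat + b_hat * x) for x, y in zip(advertising, sales)]

numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n))
denominator = sum(e ** 2 for e in residuals)
dw = numerator / denominator

print(f"Durbin-Watson = {dw:.6f}")
```

The result agrees with the Durbin-Watson value in the Example 1 printout. The statistic lies between 0 and 4, and a value near 2 suggests no first-order autocorrelation.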

f. F-statistic

It is another test of the overall explanatory power of the regression. (Refer to page 147 of the manual.)
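The F-statistic can be recovered from R² with the standard relation F = [R² / (k − 1)] / [(1 − R²) / (n − k)], where k counts all estimated coefficients including the intercept. The sketch below plugs in the R² values from the two printouts; the function name `f_statistic` is illustrative.

```python
def f_statistic(r_squared, n, k):
    """Overall F from R-squared, with n observations and k coefficients."""
    explained = r_squared / (k - 1)
    unexplained = (1 - r_squared) / (n - k)
    return explained / unexplained


# R-squared values as reported in the printouts for Examples 1 and 2.
f_example1 = f_statistic(0.851212, n=10, k=2)
f_example2 = f_statistic(0.877397, n=10, k=3)

print(f"Example 1: F = {f_example1:.2f}")
print(f"Example 2: F = {f_example2:.2f}")
```

Both values agree with the F-statistic lines in the printouts to two decimal places. As with the t-test, the computed F is compared with a critical value from the F table to judge significance.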

----end of short notes on demand estimation----
