Simultaneous equations models are econometric models for data that are jointly determined by two or more equations, with two or more dependent variables rather than just one. For example, in demand and supply models, price and quantity are determined by the interaction of two equations. OLS estimation is not appropriate for these models, and we need another way to obtain reliable estimates of the economic parameters.
A simple model, with X denoting income, might look like:
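$$
\begin{aligned}
\text{Demand: } Q_i &= \alpha_0 + \alpha_1 P_i + \alpha_2 X_i + e_{di} \\
\text{Supply: } Q_i &= \beta_0 + \beta_1 P_i + e_{si}
\end{aligned}
$$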
We would expect the demand slope α1 < 0 and the supply slope β1 > 0. The index i could represent different times or locations. P and Q are endogenous (dependent) random variables, as their values are determined within the system, while X is an exogenous (independent) random variable that we treat as given.
X being exogenous means:
$$
\begin{aligned}
E[e_{di} \mid X] &= 0 \\
E[e_{si} \mid X] &= 0
\end{aligned}
$$
We also assume homoskedasticity, no serial correlation, and no correlation between the two error terms:
$$
\begin{aligned}
\mathrm{Var}(e_{di} \mid X) &= \sigma_d^2 \\
\mathrm{Var}(e_{si} \mid X) &= \sigma_s^2
\end{aligned}
$$
Because P and Q are jointly determined, there is feedback between them. Both random error terms ed and es affect both P and Q, so P is an endogenous variable that is contemporaneously correlated with both error terms.
The least squares estimators of the reduced-form parameters π1 and π2 are consistent and approximately normally distributed in large samples, even if the structural equation errors are not normal. As the reduced-form equations show, a change in edi or esi affects both Pi and Qi. Because we cannot observe the error terms, least squares attributes part of their effect to Pi, so the OLS estimator of the structural coefficients is inconsistent.
The reduced-form equations are also known as the first-stage equations.
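The inconsistency can be illustrated with a small simulation. This is only a sketch: the parameter values, sample size, and the helper name `ols` are assumptions chosen for illustration, and the reduced-form regressions here include an intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters (illustration only).
a0, a1, a2 = 10.0, -1.0, 0.5   # demand: Q = a0 + a1*P + a2*X + e_d
b0, b1 = 2.0, 1.0              # supply: Q = b0 + b1*P + e_s

X = rng.normal(5.0, 1.0, n)    # exogenous income
e_d = rng.normal(0.0, 1.0, n)  # demand shock
e_s = rng.normal(0.0, 1.0, n)  # supply shock

# Market equilibrium: set demand equal to supply and solve for P, then Q.
P = (a0 - b0 + a2 * X + e_d - e_s) / (b1 - a1)
Q = b0 + b1 * P + e_s

def ols(y, x):
    """Least squares of y on a constant and x; returns [intercept, slope]."""
    Z = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

# Reduced-form slopes: regress each endogenous variable on X only.
pi_p = ols(P, X)[1]
pi_q = ols(Q, X)[1]
true_pi_p = a2 / (b1 - a1)

# OLS applied directly to the supply equation.
supply_slope_ols = ols(Q, P)[1]

# P is correlated with the supply error, which is what breaks OLS.
cov_p_es = np.cov(P, e_s)[0, 1]

print("reduced-form slope for P:", pi_p, "true value:", true_pi_p)
print("reduced-form slope for Q:", pi_q)
print("OLS supply slope:", supply_slope_ols, "true value:", b1)
print("cov(P, e_s):", cov_p_es)
```

In this setup the reduced-form slopes land close to their true values, while the OLS slope of the supply equation stays far from b1 no matter how large n is.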
The Identification Problem
In the supply and demand model:
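$$
\begin{aligned}
\text{Demand: } Q_i &= \alpha_0 + \alpha_1 P_i + \alpha_2 X_i + e_{di} \\
\text{Supply: } Q_i &= \beta_0 + \beta_1 P_i + e_{si}
\end{aligned}
$$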
α0, α1, and α2 cannot be consistently estimated by any estimation method (the demand equation is unidentified). However, β0 and β1 can be consistently estimated (the supply equation is identified).
In a system with M simultaneous equations, which jointly determine the values of M endogenous variables, at least M−1 variables must be absent from an equation for estimation of its parameters to be possible.
For our supply and demand equations, M = 2, so we require at least M − 1 = 1 variable to be omitted from an equation to identify it. In the demand equation, no variables are omitted, so the equation is unidentified. In the supply equation, X is omitted, so the supply curve is identified and its parameters can be estimated.
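The counting rule can be written as a tiny check. This is a sketch of the necessary condition only; the function name `satisfies_order_condition` and the set-based representation of the equations are assumptions made for illustration.

```python
def satisfies_order_condition(system_variables, equation_variables, num_equations):
    """Return True if at least M - 1 of the system's variables are absent
    from the equation, where M is the number of simultaneous equations."""
    omitted = set(system_variables) - set(equation_variables)
    return len(omitted) >= num_equations - 1

# Variables appearing anywhere in the supply and demand system.
system_variables = {"Q", "P", "X"}

demand_variables = {"Q", "P", "X"}  # nothing omitted
supply_variables = {"Q", "P"}       # X omitted

demand_ok = satisfies_order_condition(system_variables, demand_variables, 2)
supply_ok = satisfies_order_condition(system_variables, supply_variables, 2)

print("demand passes the counting rule:", demand_ok)
print("supply passes the counting rule:", supply_ok)
```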
2SLS
We can use two-stage least squares (2SLS) to estimate the coefficients of the equation that is identified. Recall that in the supply equation, Pi is contemporaneously correlated with esi:
$$
Q_i = \beta_0 + \beta_1 P_i + e_{si}
$$
From the reduced form equation:
$$
P_i = \pi_1 X_i + \nu_{1i} = E[P_i \mid X_i] + \nu_{1i}
$$
Substituting this expression for Pi into the supply equation:
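Carrying out the substitution step by step gives the following. The composite error label $e^{*}_i$ is notation introduced here for convenience:

$$
\begin{aligned}
Q_i &= \beta_0 + \beta_1 \left( E[P_i \mid X_i] + \nu_{1i} \right) + e_{si} \\
    &= \beta_0 + \beta_1 E[P_i \mid X_i] + e^{*}_i, \qquad e^{*}_i = \beta_1 \nu_{1i} + e_{si}
\end{aligned}
$$

Since $E[P_i \mid X_i]$ depends only on the exogenous $X_i$, it is not correlated with $e^{*}_i$. In practice it is unknown, so 2SLS replaces it with the fitted value $\hat{P}_i$ from a least squares regression of $P_i$ on $X_i$ and then regresses $Q_i$ on $\hat{P}_i$.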
In general, at least M − 1 variables must be omitted from each equation that we want to estimate.
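The two stages can be sketched with simulated data. The parameter values, sample size, and helper name `ols` are assumptions made for illustration, and an intercept is included in the first stage.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Hypothetical structural parameters (illustration only).
a0, a1, a2 = 10.0, -1.0, 1.0   # demand: Q = a0 + a1*P + a2*X + e_d
b0, b1 = 2.0, 1.0              # supply: Q = b0 + b1*P + e_s

X = rng.normal(5.0, 2.0, n)
e_d = rng.normal(0.0, 1.0, n)
e_s = rng.normal(0.0, 1.0, n)

# Equilibrium price and quantity implied by the two structural equations.
P = (a0 - b0 + a2 * X + e_d - e_s) / (b1 - a1)
Q = b0 + b1 * P + e_s

def ols(y, x):
    """Least squares of y on a constant and x; returns [intercept, slope]."""
    Z = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

# Stage 1: regress the endogenous P on the exogenous X, keep fitted values.
pi0_hat, pi1_hat = ols(P, X)
P_hat = pi0_hat + pi1_hat * X

# Stage 2: regress Q on the fitted values instead of P itself.
b0_2sls, b1_2sls = ols(Q, P_hat)

# Plain OLS on the supply equation, for comparison.
b1_ols = ols(Q, P)[1]

print("2SLS supply slope:", b1_2sls)
print("OLS supply slope: ", b1_ols)
print("true supply slope:", b1)
```

Running the second stage by hand like this gives the 2SLS point estimates, but the standard errors reported by an ordinary second-stage regression are not correct, so dedicated 2SLS routines are preferable for inference.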
Econometric Model
We will use the following example model:
$$
\text{Sales}_i = \beta_0 + \beta_1 \text{Price}_i + \beta_2 \text{Advert}_i + e_i
$$
The following triplet is a 3-dimensional random variable with a joint probability distribution:
$$
(\text{Sales}_i, \text{Price}_i, \text{Advert}_i)
$$
For the explanatory variables to be strictly exogenous, $(\text{Sales}_i, \text{Price}_i, \text{Advert}_i)$ has to be independent of $(\text{Sales}_j, \text{Price}_j, \text{Advert}_j)$ for $i \neq j$, and:
$$
E[e_i \mid \text{Price}_i, \text{Advert}_i] = 0
$$
This means that ei does not contain any omitted variable that both affects Sales and is correlated with Price or Advert. The assumption could fail if, for example, a competitor's price and advertising, which are not in the model, influence both our sales and our own pricing and advertising policy.
The intercept is not included because it is already accounted for in $\widetilde{\text{Sales}}_i$ and $\widetilde{\text{Advert}}_i$.
The Frisch-Waugh-Lovell (FWL) theorem states that:
$$
\sum_i \hat{e}_i^2 = \sum_i \hat{\tilde{e}}_i^2
$$
$\hat{\tilde{\beta}}_1$ can be interpreted as the change in Sales when Advert is increased by 1 unit and Price is held constant. Hence:
$$
\hat{\tilde{\beta}}_1 = \hat{\beta}_2
$$
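Even though the coefficient estimates are the same, the estimated error variance differs, because the partialled-out regression estimates only 1 coefficient while the original model estimates 3 (β0, β1 and β2):
$$
\tilde{\sigma}^2 = \frac{\sum_i \hat{e}_i^2}{N-1}, \qquad \hat{\sigma}^2 = \frac{\sum_i \hat{e}_i^2}{N-3}
$$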
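The FWL result can be checked numerically. The sketch below uses simulated data; the data-generating values and the helper name `fit` are assumptions chosen for illustration. Sales and Advert are each regressed on a constant and Price, and the residuals are then regressed on each other without an intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated data with hypothetical coefficients (illustration only).
price = rng.uniform(4.0, 7.0, n)
advert = 0.3 * price + rng.uniform(0.5, 3.0, n)
sales = 100.0 - 8.0 * price + 2.0 * advert + rng.normal(0.0, 5.0, n)

def fit(y, Z):
    """Least squares of y on the columns of Z; returns (coefficients, residuals)."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef, y - Z @ coef

ones = np.ones(n)

# Full model: Sales on a constant, Price and Advert.
coef_full, resid_full = fit(sales, np.column_stack([ones, price, advert]))

# Partial out the constant and Price from both Sales and Advert.
controls = np.column_stack([ones, price])
_, sales_tilde = fit(sales, controls)
_, advert_tilde = fit(advert, controls)

# Regress residuals on residuals, with no intercept.
coef_fwl, resid_fwl = fit(sales_tilde, advert_tilde.reshape(-1, 1))

sse_full = resid_full @ resid_full
sse_fwl = resid_fwl @ resid_fwl

# The full model estimates 3 coefficients (intercept, Price, Advert),
# the partialled-out regression only 1, so the divisors differ.
sigma2_full = sse_full / (n - 3)
sigma2_fwl = sse_fwl / (n - 1)

print("Advert coefficient, full model:", coef_full[2])
print("Advert coefficient, FWL:       ", coef_fwl[0])
print("SSE full vs FWL:", sse_full, sse_fwl)
print("variance estimates full vs FWL:", sigma2_full, sigma2_fwl)
```

The coefficient and the sum of squared residuals agree up to floating-point error, while the variance estimate from the partialled-out regression is slightly smaller because it divides by a larger number.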
The idea is to partition the explanatory variables into two groups: one that is the primary focus, and another that contains the control variables.
For example, we divide the variables $(x_{i1} = 1, x_{i2}, \cdots, x_{ik})$ into two groups:
Assuming MR1-MR5 hold, the least squares estimators are the “Best Linear Unbiased Estimators (BLUE)” of the parameters in the multiple regression model. Furthermore, if the errors are normally distributed, the t-statistics formed from the least squares estimates and their standard errors (which use the estimated error variance $\hat{\sigma}^2$) follow a t-distribution.
The BLUE and t-distribution properties are what are called finite sample properties. As long as N > k, these properties hold. If the assumptions do not hold, we need to rely on large sample, or asymptotic, properties instead. This requires N to be sufficiently large.
Variances and Covariances of the Least Squares Estimators
For k=3, we can express the conditional variances and covariances as:
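As a reference point, the standard textbook expressions for a model with an intercept and two regressors $x_{i2}$ and $x_{i3}$ can be written as follows. The symbols $b_2$ and $b_3$ (the least squares estimators of the two slope coefficients) and $r_{23}$ (the sample correlation between $x_2$ and $x_3$) are notation assumed here:

$$
\begin{aligned}
\mathrm{Var}(b_2 \mid X) &= \frac{\sigma^2}{(1 - r_{23}^2) \sum_i (x_{i2} - \bar{x}_2)^2} \\
\mathrm{Var}(b_3 \mid X) &= \frac{\sigma^2}{(1 - r_{23}^2) \sum_i (x_{i3} - \bar{x}_3)^2} \\
\mathrm{Cov}(b_2, b_3 \mid X) &= \frac{-r_{23}\, \sigma^2}{(1 - r_{23}^2) \sqrt{\sum_i (x_{i2} - \bar{x}_2)^2} \sqrt{\sum_i (x_{i3} - \bar{x}_3)^2}}
\end{aligned}
$$

The variances fall as the error variance $\sigma^2$ falls and as the variation in each regressor grows, and they rise as $r_{23}^2$ approaches 1, that is, as the two regressors become more collinear.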