A dummy variable is a numerical variable used in statistical analysis to represent subgroups of the sample. In research design, dummy variables are most often used to distinguish between treatment groups. In the simplest case a dummy variable takes the values 0 and 1, where 0 denotes the control group and 1 denotes the treatment group. Dummy variables are useful because they allow a single regression equation to represent multiple groups. Another advantage of a 0/1 dummy-coded variable is that, although it is a nominal-level variable, it can be entered into a regression and treated like an ordinary numeric variable.

To see how a dummy variable works, consider a simple regression equation in which the dummy takes the values 0 and 1.

yi = B0+B1Zi+ei

In this equation, yi is the outcome for the ith unit, B0 is the intercept, and B1 is the slope, which here is the difference between the treatment and control groups. Zi is the dummy variable, equal to 1 for units in the treatment group and 0 for units in the control group, and ei is the residual, which captures the error in the model.

A single equation with a dummy variable holds a good deal of information. To see this, write the equation out separately for each group. Start with the control group, for which Z = 0. Substituting this value into the equation above, and assuming that the error term is zero, leaves an equation with only one term.

The resulting equation is

yc = B0

Similarly, for the treatment group Z = 1. Again assuming that the error term is zero, the equation becomes

yt = B0 + B1

Now solve for B1, the difference between the two groups. Subtracting the control group equation from the treatment group equation gives

yt - yc = (B0 + B1) - B0 = B1
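
The result of this subtraction can be checked numerically. The sketch below is only an illustration: the outcome values are invented, and the variable names (y_control, y_treated, b0, b1) are assumptions rather than anything from the original text. It fits the regression by ordinary least squares with NumPy and compares the coefficients with the group means.

```python
import numpy as np

# Hypothetical outcome data, invented for illustration only.
y_control = np.array([4.0, 5.0, 6.0, 5.0])   # units with Z = 0
y_treated = np.array([7.0, 8.0, 9.0, 8.0])   # units with Z = 1

# Stack the outcomes and build the matching 0/1 dummy variable Z.
y = np.concatenate([y_control, y_treated])
z = np.concatenate([np.zeros(len(y_control)), np.ones(len(y_treated))])

# Design matrix: a column of ones for the intercept B0, then the dummy Z for B1.
X = np.column_stack([np.ones_like(z), z])

# Ordinary least squares estimates of B0 and B1.
coef = np.linalg.lstsq(X, y, rcond=None)[0]
b0, b1 = coef

print("B0 (intercept):", b0)
print("B1 (slope):", b1)
print("control group mean:", y_control.mean())
print("difference in group means:", y_treated.mean() - y_control.mean())
```

With these made-up numbers the fitted intercept should match the control group mean and the fitted slope should match the difference between the two group means, up to floating-point rounding.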

So a single equation with a dummy variable describes both groups at once. In almost all cases, dummy variables are used in this way to represent separate subgroup equations. They are applied in the following two cases.

• Finding the difference between groups by writing out the equation for each group and taking the difference between those equations.

• Forming a separate equation for each group by substituting that group's dummy values into the model.
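
Both cases can be sketched in a few lines of Python. This is a minimal illustration, not a general implementation: the function name group_prediction and the coefficient values are hypothetical, and the error term is assumed to be zero, as in the derivation above.

```python
def group_prediction(b0, b1, z):
    """Return y = B0 + B1*Z with the error term assumed to be zero."""
    return b0 + b1 * z

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = 5.0, 3.0

# Separate equation for each group, obtained by substituting the dummy value.
y_c = group_prediction(b0, b1, 0)   # control group: reduces to B0
y_t = group_prediction(b0, b1, 1)   # treatment group: reduces to B0 + B1

# Difference between the two group equations, which reduces to B1.
diff = y_t - y_c

print("control:", y_c, "treatment:", y_t, "difference:", diff)
```

Substituting Z = 0 and Z = 1 gives the two group equations, and their difference recovers the slope B1.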

