Qiskit-Optimization module Internals -Part 2
In the previous blog, we dived into the source code of quadratic programs. To understand quadratic programs well, we had to also dive into source codes of other classes within Qiskit Optimization, which can be seen here. That blog covered everything we need to understand the source codes of converters. So now is the right time to actually dive into converters.
Why do we need converters? The main reason is that it makes it simpler to solve some problems after converting them into another format and the other reason is that certain algorithms work on a certain type of quadratic program and to bring it into that format, one requires to convert the original problem.
If you look at the source code of any converter class, you will see they inherit this abstract class called QuadraticPorgramConverter. Abstract classes basically provide a template or interface for the classes that inherit them. So all the converters we see would have a convert method that takes a quadratic program as an input and an interpret method that takes the solution of a converted problem as a list or an array. All other method names in the source code are prefixed by an underscore (_) indicating that they are private, which means that though we as users can access them if we want, they are really not intended for use because they are used within the convert method, so they are called anyway when we run convert, but since we want to understand the source codes well, these methods are very important to dive into. Let’s start with the inequality to equality converter. We will take quadratic programs from the converters tutorial mainly because I have checked that they are solvable by docplex and we require to solve for us to understand the interpret method. We will start by first creating a quadratic program:
Next we use the inequality to equality converter in the following way:
We see some slack variables are created and the inequality constraints have got converted to equality. It is intuitive that for LE constraint, the slack variable gets added with a + sign and for a GE constraint, it gets added with a — sign to ensure equality. Let us dive deeper into the source code. Below is a snapshot of the constructor method (__init__) and a part of the convert method, we will go over the rest of convert soon.
The standard design for all converters is to start with a src (source) quadratic program and create a dst (destination) quadratic program in which we would be adding the converted quadratic program. Before looking at the attribute ‘mode’, we can see that all the variables and objective of src are directly added to dst because these would be unaffected by the addition of slack variables in the constraints. The attribute mode is basically there so that we can specify if we want integer or continuous slack variables. It is intuitive to understand why we cannot have binary slack variables. Below is how the auto mode works, the snapshots include a _any_float method and a part of a important method that gets called within the convert method. (Note: In order to understand the source codes, one way I find useful is to re-write methods as normal functions. In this context, what that would mean is that we set src to the quadratic program (qp) we just created, set dst to an empty quadratic program, remove the ‘self’ argument from the re-written methods since that points to the class object, but we have re-written them as normal functions. Within the functions or loops, we can try printing outputs for steps we feel we need to understand better).
The _any_float method checks if there is float in an array that cannot be written as an integer. For eg: 5.0 is a float, but it can be written an integer equivalent 5. 5.5 is a float, but doesn’t have an integer version. We look at the coefficients of the linear constraint (in a quadratic constraint, we will look at the coefficients of the linear part and quadratic part separately, as in the _add_slack_var_quadratic_constraint method). The auto mode takes the slack variables to be continuous if there are only float coefficients that cannot be represented as integers, else it takes the slack variables to be integers. In the part of _add_slack_var_linear_constraint we have above, we also see that if the mode is integer, the rhs of the constraints are floored or ceiled. It is intuitive why for LE, we take floor and GE, we take ceil. The rest of the _add_slack_var_linear_constraint method looks like below:
Remember we saw this class called ExpressionBounds above. It is being used here to find bounds of linear constraints. What is a bound of a linear constraint or any linear expression? The below property from LinearExpression class shows it, but before we go over that we can see that the bounds for new slack variables are computed using these bounds of expressions, the rhs of the constraints and depending on if the constraints are GE or LE. The new slack variables are named in a certain format and we see in our example of converted quadratic problem that we have xyz_leq@int_slack and GE or LE are replaced by ==. Coming back to expression bounds, It basically looks at the coefficient of each variable in the constraint’s LHS, finds the lower and upper bound of each variable and then, the lower bound of the expression is just an aggregation of min( coefficient * var’s lower bound, coefficient * var’s upper bound) for every variable (var) in the linear constraint expression. Similarly, the upper bound of linear constraint or any linear expression is aggregation of max( coefficient * var’s lower bound, coefficient * var’s upper bound) over all variables in the linear constraint.
The addition of slack variables for quadratic constraints is very similar to what we saw in linear constraints, just that for quadratic constraints, we have both linear and quadratic parts. We don’t have quadratic constraints in the quadratic program we chose as an example, but the only change is in computing the bounds of the slack variables because now we need to take into account bounds for the quadratic expressions in the quadratic constraints as well. We can see the change in the below snapshot of the part of the method _add_slack_var_quadratic_constraint
The last bit is to solve the converted problem and get back solution for the original problem from that. We will use docplex to solve our converted quadratic program and to get back solution for the original problem, we use the interpret method. I did it in a notebook with some markdown comments, so I will just take a snapshot from my notebook and put it here (note that qp_eq was created above):
Now, let’s move on to the next converter: Integer to binary. We can apply it as following:
As the name suggests, we want to convert integer variables to binary variables. Intuitively, how would one do this? The way I can think is, if I have an integer variable that has a lower bound of 0 and an upper bound of 7, I will introduce 8 binary variables each representing the 8 possible values the variable can take and then, for any possible solution of the quadratic program, only one of these 8 binary variables would take the value 1 and the rest of them would be 0. Let x be the integer variable we are talking about. By our intuitive thought process, x would be broken down into say, x0, x1, x2, x3, x4, x5, x6 and x7 where each of these are binary. Now suppose in the objective function, x occurs as a term, 3x. This 3x would now would have to written as 3(0*x0 + 1*x1 + 2*x2 + 3*x3 + 4*x4 + 5*x5 + 6*x6 + 7*x7). Why are we taking 1*x1, 2*x2, 3*x3 and so on, that’s because each of x0 to x7 are binary variables representing integer values 0 to 7. This means that after breaking an integer variable into binary components, intuitively, while representing anywhere that integer occurs in the objective or in the constraints, we would require the original coefficients of linear and quadratic expressions where this integer occurred (i.e. 3 of 3x that we considered in our example), along with some other coefficients associated to these binary variables (0 to 7 in our case).
In the source code, it’s very similar to what we have been thinking. The difference is that they are not breaking down an integer into exactly the same number of binary variables as the possible values the integer variable can take. What they are doing can be seen from the method below, taken from the source code:
coeffs in the above code are basically analogous to the coefficients 0 to 7 we considered in our thinking and we can see that the number of binary variables created are int(log2(upper bound — lower bound))+1. Why? I think the logic here is, one needs approximately int(log2(variable range)) number of bits to cover all the values that a integer variable takes. In our eg: the range is 7–0=7 and 7 can be written with 3 bits. Since int(log2(7)) is 2, I need to add 1 more bit and that’s how I take the number of binary variables w.r.t the integer variable to be int(log2(upper bound — lower bound))+1. Since we are taking 3 binary variables instead of 8 as per our intuitive thinking, of course, our coefficients can’t be 0 to 7 anymore. For our integer variable, the coefficients for the 3 binary variables, turn out to be [1,2,4]. The method ultimately gives us a list of tuples, like for integer variable x, it would return [(x@0, 1), (x@1, 2), (x@2, 4)]. (This approach seems of course, better than what we thought intuitively, mainly because for our approach, there will be many more binary variables created, equal to the range of the integer variable and for every potential solution of the problem, only 1 of them would actually be 1 and the rest would add 0 value) How does this get used in the converter code? One can see that from the part of convert method below:
We see that for any integer in src (source or original quadratic program), we use _covert_var method and then, we have an attribute called _conv which is of the type Dict, where we set the original integer name as the key and set the value as the list of tuples that we get from _convert_var. The names from each tuple associated to the integer variable are then, just added as binary variables in dst (destination or converted quadratic program). If the variables are continuous or binary in src, they are simply added to dst as they are. If there are no integer variables, dst is just taken as an exact copy of src. After conversion of integer variables to binary variables, we need to think of how we would perform the substitution in the objective function and the constraints. That is performed using the methods below. We can see that the final changes to the objectives and constraints are actually very similar to how we were thinking intuitively. For all linear expressions in the objective and in the constraints, the changes are made according to the below method:
We can see that like how we thought intuitively, the coefficients in the linear expression change by taking the product of the original coefficient and the specific coefficients taken w.r.t each newly created binary variable. In our eg, we had done 3x = 3(0*x0 + 1*x1 + 2*x2 + 3*x3 + 4*x4 + 5*x5 + 6*x6 + 7*x7), but we never thought of constant in our intuitive thinking process. We can see how they are computing constant by aggregating the product of original coefficient and the original integer variable’s lower bound across all integer variables present in the original problem. I am not entirely sure if I get the logic behind that, but if we think about it, even though we are converting integer variables to binary variables, we need keep track of the lower bound and upper bound of the original integer variable. In _convert_var method, there was bounded_coef, but we also need something to take into account the lower bound of the integer variable and that’s done by calculating constant this way.
For the quadratic expressions in the objective and constraints, the changes happen very similarly. Note that in quadratic parts, we need to consider every tuple of variables. Below is the method, _convert_quadratic_coefficients_dict:
We can note that, linear.get(y.name, 0.0) would give us the actual value of y if it is already there in the linear expression (taken as a dict), else it would add y to the dict and set the value of it as 0.0. We can see that the conversion is similar to the linear case and it’s very intuitive for the quadratic parts. For eg: if we had 3xy where x is integer variable and y is not, then, 3xy = 3(1*x@0 + 2*x@1 + 4*x@2)*y (we are taking the same x example). If y is also an integer variable, then 3xy = 3(1*x@0 + 2*x@1 + 4*x@2)*(1*y@0 + 2*y@1 + 4*y@2) (assuming y also has the same bounds, 0 and 7). We can notice the linear parts are arrived in a somewhat similar way to how constant was arrived at in the case of linear expressions.
Finally all this conversion is applied in _substitute_int_var method for every linear and quadratic expression in the objective as well as in the constraints. The modified objective and constraints are added to the dst quadratic program within this method. We can see _substitute_int_var being called in the convert method. Let’s talk about interpret method now. Since I have done it in my own notebook, I can just put the snapshot here:
Let’s consider Linear equality to penalty converter now. This converter helps us modify the objective in such a way that we can get rid of linear equality constraints. The reason this is useful is that constraints always make the problems more difficult to solve, and the difficulty increases with the number of constraints and the type of constraints.
All variables (integer, binary and continuous) are added to dst from src, as they are. There are no new variables created. In fact, dst has no linear equality constraints anymore. Since our problem had only linear equality constraints post applying the inequalities to equalities converter, there are no constraints at all in the converted quadratic program for us. The objective has of course, changed to take into account the absence of the linear equality constraints that were previously there. This converter can only work if among linear constraints, there are only equality constraints. From the part of convert method below taken from the source code of this converter, we can see that a QiskitOptimizationError is raised if among linear constraints, there are any inequality constraints.
Now, it only remains to see how the penalty in the above code is calculated. As suggested before, to understand these methods better, re-writing them as usual functions mainly by removing the object reference ‘self’ and printing out line outputs within loops, helps a lot. Coming to penalty, if any of the coefficients in any of the linear equality constraints, including the rhs from all the linear equality constraints is a float that cannot be written as an integer (eg: 3.5), it throws a warning and a default penalty set to 1e5 is used.
If the above is not the case, the penalty is calculated using Expression bounds of the linear and quadratic expressions of the objective. This is what the method _auto_define_penalty gives out as penalty:
Coming to the interpret method, I will again put a snapshot of what I did in my notebook. You will see that the interpret method gives out the same solution for the original problem as the converted problem because the variables have not changed, it is just a modified objective that compensates for the absence of linear equality constraints that were present in the original problem.
The other converter is linear inequality to penalty which would help us modify the objective so that we don’t have linear inequalities. From the source code, the penalty is calculated exactly the same way as for the linear equality to penalty converter. The interpret method also, of course, works the same way. The convert, however, uses some ways to match constraints to specific types of constraints and there is a conversion table for these. I won’t go into further details. I think an alternative could be to convert linear inequalities to equalities and then, just use the linear equality to penalty converter. There is also flip problem sense converter which basically helps us convert the objectives from max to min or min to max, which is very intuitive. We also have quadratic program to qubo. I will cover QUBO in a future blog, but we can see that this converter basically uses all the converters we discussed in this blog.
I will end this blog here. I hope you liked it. Let me know how it can be improved. I would like to say that though this blog is about internals of a small part of a wonderful Quantum Computing framework that we love using, there is not much about Quantum Computing itself discussed in the blog, but this can be treated as an important pre-requisite for understanding a lot of quantum computing algorithms from source codes that we should dive into in future blogs. The intention of this blog was also to show how and what my experience while going through source codes was. Thank you!!