Asked  8 Months ago    Answers:  5   Viewed   188 times

I have a system of linear equations that make up an NxM matrix (i.e. Non-square) which I need to solve - or at least attempt to solve in order to show that there is no solution to the system. (more likely than not, there will be no solution)

As I understand it, if my matrix is not square (over or under-determined), then no exact solution can be found - am I correct in thinking this? Is there a way to transform my matrix into a square matrix in order to calculate the determinate, apply Gaussian Elimination, Cramer's rule, etc?

It may be worth mentioning that the coefficients of my unknowns may be zero, so in certain, rare cases it would be possible to have a zero-column or zero-row.



Whether or not your matrix is square is not what determines the solution space. It is the rank of the matrix compared to the number of columns that determines that (see the rank-nullity theorem). In general you can have zero, one or an infinite number of solutions to a linear system of equations, depending on its rank and nullity relationship.

To answer your question, however, you can use Gaussian elimination to find the rank of the matrix and, if this indicates that solutions exist, find a particular solution x0 and the nullspace Null(A) of the matrix. Then, you can describe all your solutions as x = x0 + xn, where xn represents any element of Null(A). For example, if a matrix is full rank its nullspace will be empty and the linear system will have at most one solution. If its rank is also equal to the number of rows, then you have one unique solution. If the nullspace is of dimension one, then your solution will be a line that passes through x0, any point on that line satisfying the linear equations.

Wednesday, October 13, 2021

I think Eigen is what you're looking for.

It is a headers only library and compiles on many compilers. It even uses exotic assembly for faster math.

This is the page that shows off the linear solver api.

It has a few solvers with a simple api.

Friday, July 30, 2021
Aidan D

In a comment the poster specifically asks about using solve and optim so we show how to solve this (1) by hand, (2) using solve, (3) using optim and (4) a fixed point iteration.

1) by hand First note that if we write a = 5/b based on the first equation and substitute that into the second equation we get sqrt(5/b * b^2) = sqrt(5 * b) = 10 so b = 20 and a = 0.25.

2) solve Regarding the use of solve these equations can be transformed into linear form by taking the log of both sides giving:

log(a) + log(b) = log(5)
0.5 * (loga + 2 * log(b)) = log(10)

which can be expressed as:

m <- matrix(c(1, .5, 1, 1), 2)
exp(solve(m, log(c(5, 10))))
## [1]  0.25 20.00

3) optim Using optim we can write this where fn is from the question. fn2 is formed by subtracting off the RHS of the equations and using crossprod to form the sum of squares.

fn2 <- function(x) crossprod( fn(x[1], x[2]) - c(5, 10))
optim(c(1, 1), fn2)


[1]  0.2500805 19.9958117

[1] 5.51508e-07

function gradient 
      97       NA 

[1] 0


4) fixed point For this one rewrite the equations in a fixed point form, i.e. in the form c(a, b) = f(c(a, b)) and then iterate. In general, there will be several ways to do this and not all of them will converge but in this case this seems to work. We use starting values of 1 for both a and b and divide both side of the first equation by b to get the first equation in fixed point form and we divide both sides of the second equation by sqrt(a) to get the second equation in fixed point form:

a <- b <- 1  # starting values
for(i in 1:100) {
  a = 5 / b
  b = 10 / sqrt(a)

data.frame(a, b)
##      a  b
## 1 0.25 20
Wednesday, August 11, 2021

You are trying to solve least squares with box constraints. Standard sparse least squares algorithms include LSQR and more recently, LSMR. These only require you to apply matrix-vector products. To add in the constraints, realize that if you are in the interior of the box (none of the constraints are "active"), then you proceed with whatever interior point method you chose. For all active constraints, the next iteration you perform will either deactivate the constraint, or constrain you to move along the constraint hyperplane. With some (conceptually relatively simple) suitable modifications to the algorithm you choose, you can implement these constraints.

Generally however, you can use any convex optimization package. I have personally solved this exact type of problem using the Matlab package CVX, which uses SDPT3/SeDuMi for a backend. CVX is merely a very convenient wrapper around these semidefinite program solvers.

Tuesday, September 21, 2021

Let X be the collection of all 100,000 xs you got (such that the i-th column of X equals the x_i-th vector).
In the same manner we can define Y and C as 2D collections of ys and cs respectively.

What you wish to solve is for A and B such that

C = AX + BY

You have 2 * 50,000^2 unknowns (all entries of A and B) and numel(C) equations.

So, if the number of data vectors you have is 100,000 you have a single solution (up to linearly dependent samples). If you have more than 100,000 samples you may seek for a least-squares solution.


C = [A B] * [X ; Y]  ==>  [X' Y'] * [A';B'] = C'

So, I suppose

[A' ; B'] = pinv( [X' Y'] ) * C'

In matlab:

ABt = pinv( [X' Y'] ) * C';
A = ABt(1:50000,:)';
B = ABt(50001:end,:)';

Correct me if I'm wrong...

It seems like there is quite a fuss around dimensionality here. So, I'll try and make it as clear as possible.

Model: There are two (unknown) matrices A and B, each of size 50,000x50,000 (total 5e9 unknowns).
An observation is a triplet of vectors: (x,y,c) each such vector has 50,000 elements (total of 150,000 observed points at each sample). The underlying model assumption is that an observation is generated by c = Ax + By in this model.
The task: given n observations (that is n triplets of vectors { (x_i, y_i, c_i) }_i=1..n) the task is to uncover A and B.

Now, each sample (x_i,y_i,c_i) induces 50,000 equations of the form c_i = Ax_i + By_i in the unknown A and B. If the number of samples n is greater than 100,000, then there are more than 50,000 * 100,000 ( > 5e9 ) equations and the system is over constraint.

To write the system in a matrix form I proposed to stack all observations into matrices:

  • A matrix X of size 50,000 x n with its i-th column equals to observed x_i
  • A matrix Y of size 50,000 x n with its i-th column equals to observed y_i
  • A matrix C of size 50,000 x n with its i-th column equals to observed c_i

With these matrices we can write the model as:

C = A*X + B*Y

I hope this clears things up a bit.

Thank you @Dan and @woodchips for your interest and enlightening comments.

EDIT (2):
Submitting the following code to octave. In this example instead of 50,000 dimension I work with only 2, instead of n=100,000 observations I settled for n=100:

n = 100;
A = rand(2,2);
B = rand(2,2);
X = rand(2,n);
Y = rand(2,n);
C = A*X + B*Y + .001*randn(size(X)); % adding noise to observations 
ABt = pinv( [ X' Y'] ) * C';

Checking the difference between ground truth model (A and B) and recovered ABt:

ABt - [A' ; B']


  ans =

   5.8457e-05   3.0483e-04
   1.1023e-04   6.1842e-05
  -1.2277e-04  -3.2866e-04
  -3.1930e-05  -5.2149e-05

Which is close enough to zero. (remember, the observations were noisy and solution is a least-square one).

Monday, November 22, 2021
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :