I learned vectors in school in seven different courses, each taught as if the material were new, in three different styles:
  1. Vectors as founded on two and three dimensional geometry,
  2. Vectors as n-tuples of real numbers,
  3. Vectors as described axiomatically.
The prettiest and most elegant style is the axiomatic one, and it is the one we pursue here, describing some of its benefits along the way.

This note is a brief presentation of vectors from the same perspective as Halmos’ excellent small volume: “Finite-Dimensional Vector Spaces” (ISBN: 0387900934). I add nothing here, and of course omit much.

I dedicate this page to Marvin Epstein who taught linear algebra at Berkeley in 1953. He was perhaps the best math teacher I have ever known. He made this stuff seem simple, elegant and important.

The Vector Space

A vector space is always associated with some field. If you don’t know what a field is, read “real numbers” for “field” and you won’t miss anything in this note. We call the elements of the field “scalars” for no better reason than to distinguish them from vectors. We use Greek letters α, β, γ to denote scalars and small Latin letters u, v, w to denote vectors. Aside from the operators that come with the field, a vector space has two others: addition of two vectors, yielding a vector, and scalar multiplication of a scalar by a vector, yielding a vector. Thus αu is the vector resulting from multiplying the scalar α by the vector u, and u+v is the vector sum of the vectors u and v.
There are several axioms governing these new operations and relating them to the operations in the field.

There is a vector, 0, such that for all x, x+0 = x.
(u+v)+w = u+(v+w)
u+v = v+u
α(u+v) = αu+αv
(α + β)u = αu + βu
α(βu) = (αβ)u
0u = 0
1u = u

The “0” in “0u” is a scalar, of course, because you cannot put a vector there. Ditto for the “1” in “1u”. The right sides of the equations are vectors because that is what this kind of multiplication yields. It should be noted that all of these equations remain true if we interpret u, v and w as scalars. Vectors thus behave much like numbers.
(The vector 0 is unique: if 0 and 0′ both satisfy the first axiom, then 0 = 0+0′ = 0′.)
Perhaps the simplest thing about vectors that is not true of scalars is that we may have two vectors u and v so that αu+βv = 0 only when α=β=0. If vectors were merely numbers then we could let α=v and β=−u to solve the equation. This is not generally possible in vector spaces.
Note that if we let the vectors be the field itself then all of the axioms are satisfied. This is a very simple example of a vector space. An even simpler one is the vector space that consists merely of the single vector 0. The simplest nontrivial vector space may be illustrated by ordered pairs of scalars <α, β>—but we must give the rules for addition and scalar multiplication before we can claim to have presented a vector space.

<α, β> + <γ, δ> = <α + γ, β + δ>
α<β, γ> = <αβ, αγ>

It is easy and only slightly tedious to verify that the vector axioms are satisfied by these definitions. It may help to note that if the field is the reals then <α, β> may be taken as the vector from the origin to the point with Cartesian coördinates α, β. If we let u = <0, 1> and v = <1, 0> then we indeed cannot solve αu+βv = 0 except for α=β=0.
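For the skeptical reader, here is a minimal sketch in Python (the names are my own, and the rationals stand in for the field so that the arithmetic is exact) that spot-checks every axiom on random pairs:

    from fractions import Fraction
    from random import randint

    def rand_scalar():
        # a random rational; the rationals form a field and arithmetic is exact
        return Fraction(randint(-9, 9), randint(1, 9))

    def add(u, v):                  # <α, β> + <γ, δ> = <α + γ, β + δ>
        return (u[0] + v[0], u[1] + v[1])

    def scale(a, u):                # α<β, γ> = <αβ, αγ>
        return (a * u[0], a * u[1])

    zero = (Fraction(0), Fraction(0))
    for _ in range(100):
        a, b = rand_scalar(), rand_scalar()
        u, v, w = [(rand_scalar(), rand_scalar()) for _ in range(3)]
        assert add(u, zero) == u                                  # x + 0 = x
        assert add(add(u, v), w) == add(u, add(v, w))             # (u+v)+w = u+(v+w)
        assert add(u, v) == add(v, u)                             # u+v = v+u
        assert scale(a, add(u, v)) == add(scale(a, u), scale(a, v))
        assert scale(a + b, u) == add(scale(a, u), scale(b, u))
        assert scale(a, scale(b, u)) == scale(a * b, u)
        assert scale(Fraction(0), u) == zero                      # 0u = 0
        assert scale(Fraction(1), u) == u                         # 1u = u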

Tuples as Vector Space

n-tuples of field elements provide a vector space when analogous definitions of addition and scalar multiplication are provided. If αᵢ is the ith scalar of an n-tuple posing as vector u, and ditto βᵢ for vector v, then the n-tuple w with scalars γᵢ = αᵢ + βᵢ forms the sum w = u + v, and the vector x = εu with ith component εαᵢ is the product of the scalar ε with the vector u. Note that two different n-tuples of scalars are distinct vectors, for their difference is not 0. If the field is finite with k elements then the new vector space has kⁿ elements.
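To make the finite case concrete, here is a small sketch over the two-element field GF(2) (arithmetic mod k, which gives a field only when k is prime), counting the kⁿ vectors and checking closure:

    from itertools import product

    k, n = 2, 3                               # GF(2) and triples
    def add(u, v):                            # componentwise addition mod k
        return tuple((a + b) % k for a, b in zip(u, v))
    def scale(c, u):                          # scalar multiplication mod k
        return tuple((c * a) % k for a in u)

    space = set(product(range(k), repeat=n))
    print(len(space))                         # k**n = 8 vectors
    # every sum and scalar multiple lands back in the space
    assert all(add(u, v) in space for u in space for v in space)
    assert all(scale(c, u) in space for c in range(k) for u in space)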

Functions as Vector Space

A less widely known vector space is the space of real-valued functions of some variable ranging over a fixed domain; the particular domain does not matter. Vector addition and scalar multiplication are defined thus:

(f+g)(x) = f(x) + g(x)
(αf)(x) = α(f(x))

Again the vector axioms are easily verified.
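In Python the verification almost writes itself; here is a minimal sketch (names mine) in which vectors are ordinary functions and the operations build new functions:

    import math

    def vadd(f, g):                 # (f+g)(x) = f(x) + g(x)
        return lambda x: f(x) + g(x)

    def vscale(a, f):               # (αf)(x) = α(f(x))
        return lambda x: a * f(x)

    zero = lambda x: 0.0            # the 0 vector of this space

    h = vadd(vscale(2.0, math.sin), math.cos)              # the vector 2·sin + cos
    print(h(1.0) == 2.0 * math.sin(1.0) + math.cos(1.0))   # True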

Linear Operators

A linear operator is a function that operates on a vector in one vector space, its domain, and yields a vector in another vector space, its range. The two spaces must have the same field. We denote linear operators by large Latin letters A, B, C. Linear operators obey the following axioms:

A(u+v) = Au + Av
A(αu) = α(Au)

In the rest of this note “operator” means “linear operator”. The meanest operator is Au=0 for all u. Au=u is slightly more interesting—the domain and range are the same. These two operators are called 0 and I respectively. For our Cartesian vector space we may define an operator A<α, β> = <β, −α>. This is a rotation by 90 degrees. A full-fledged operator here might be A<α, β> = <3.7α + 6β, 2.1α + 5β>. Operators on function space include the derivative and

(Af)(x) = f(x+3).
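These examples are easily coded; the sketch below (my own names) writes each operator as a plain function and spot-checks linearity for the rotation:

    def rot90(u):                        # A<α, β> = <β, −α>, rotation by 90 degrees
        a, b = u
        return (b, -a)

    def full(u):                         # A<α, β> = <3.7α + 6β, 2.1α + 5β>
        a, b = u
        return (3.7*a + 6*b, 2.1*a + 5*b)

    def shift(f):                        # (Af)(x) = f(x+3) on function space
        return lambda x: f(x + 3)

    # linearity spot-check: A(u+v) = Au + Av for the rotation
    u, v = (1.0, 2.0), (3.0, 4.0)
    s = (u[0] + v[0], u[1] + v[1])
    Au, Av = rot90(u), rot90(v)
    assert rot90(s) == (Au[0] + Av[0], Au[1] + Av[1])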

Operator as Vector Space

We now make a simple, powerful, but confusing observation—given two vector spaces D and R over the same field, the set of all operators with domain D and values in R is itself a vector space over the same field! We must first define addition and scalar multiplication of operators:

(A+B)u = Au+Bu
(αA)u = α(Au)

Note that in each case we have defined a new operator by providing a formula for the result of applying that operator, (A+B) or (αA), to an arbitrary vector u. We also decree that two operators A and B are equal if and only if Au=Bu for all u. Now it is both confusing and tedious to verify the axioms, as three vector spaces are in play: the domain, the range, and the tentative vector space that the operators themselves compose. We illustrate with the axiom α(u+v) = αu+αv. To show that two operators A and B (which we view as vectors) are equal we must show that Au = Bu for all u.

(αA+αB)u = (αA)u + (αB)u = α(Au) + α(Bu) = α(Au+Bu) = α((A+B)u) = (α(A+B))u

We thus see that (αA+αB) = (α(A+B)) since u was arbitrary.
Similarly (α+β)u = αu+βu is proven for operators thus:

((α+β)A)u = (α+β)(Au) = α(Au) + β(Au) = (αA)u + (βA)u = ((αA) + (βA))u

and α(βA) = (αβ)A is proven:

(α(βA))u = α((βA)u) = α(β(Au)) = (αβ)(Au) = ((αβ)A)u

The others are even easier.
When the domain and range are the same space the space of operators has a natural multiplication defined: (AB)u = A(Bu). The following are easily derived:

(AB)C = A(BC)
A(B+C) = AB + AC
(A+B)C = AC + BC
(αA)B = A(αB) = α(AB)

Notably absent is AB = BA, for this is not generally true.
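A quick numerical illustration, assuming numpy: take the 90 degree rotation and a stretch of the second coordinate; composing them in the two orders gives different operators.

    import numpy as np

    A = np.array([[0., 1.],
                  [-1., 0.]])            # rotation by 90 degrees
    B = np.array([[1., 0.],
                  [0., 2.]])             # stretch the second coordinate

    print(A @ B)                         # matrix of the operator AB
    print(B @ A)                         # matrix of the operator BA
    assert not np.array_equal(A @ B, B @ A)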

If the domain of A is the same space as the range of B then AB is taken to be the composition of A and B which is an operator from the domain of B to the range of A. As before (AB)u = A(Bu). All of the above identities still hold.

Models

An apocryphal anecdote concerns a mathematician who studied some sort of algebraic structures for some years before discovering that there were none: in short, his axioms were inconsistent! We are in no such danger, having already provided several examples of vector spaces.

How many non-isomorphic vector spaces are there? We will hold the field constant. To study this question we introduce the notion of independence of a set of vectors.
A finite set {uᵢ} of vectors is independent whenever Σαᵢuᵢ = 0 implies that αᵢ = 0 for each i. If B is the set of all of the uᵢ then a value Σαᵢuᵢ is called a linear combination of elements of B, and B is said to span the set of all such values. If the set B is independent and spans the entire space, then B is called a basis for the space.
Consider the following process for discovering a basis B for a space X: Let B be initially empty. While possible do:

Choose a vector in X that is not a linear combination of vectors in B.
Add that vector to B.

When every point in X can be represented as Σαᵢuᵢ with uᵢ in B, it is impossible to continue this process, and the resulting set B is a basis for the space X. Note that there is much latitude in choosing vectors in this process. For example, any vector but 0 will serve as the first element of B.
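Over the reals, and with candidates drawn from a finite pool rather than from all of X, the process can be sketched with numpy’s rank test:

    import numpy as np

    def grow_basis(candidates):
        # keep a vector only if it is not a linear combination of those kept so far
        B = []
        for x in candidates:
            if np.linalg.matrix_rank(np.array(B + [x])) == len(B) + 1:
                B.append(x)
        return B

    pool = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 5]]
    print(grow_basis(pool))              # [[1, 0, 0], [0, 1, 0], [0, 0, 5]]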

It turns out that the number of elements in B does not depend on these choices of elements for B. That number is the dimensionality of X; more commonly one says “X is an n-dimensional vector space”. For some vector spaces this process of choosing basis elements for B does not terminate. The function vector spaces above are like this when the functions’ domain is infinite.

If X and Y are vector spaces over the same field and with the same finite dimensionality then they are isomorphic. We can choose a basis in each, {xᵢ} and {yᵢ}, and then the bijection Σαᵢxᵢ ↔ Σαᵢyᵢ is the map of the isomorphism.

Here is a digression on infinite dimensional vector spaces.

If the two spaces X and Y are the same but we choose two bases, then we have an automorphism of the space. Automorphisms comprise all of the symmetries of a vector space. In such a case we have a basis {uᵢ} and another basis {vᵢ} for the same vector space. It is profitable to consider expressing the members of one basis in terms of the other. In general we have uᵢ = Σⱼαᵢⱼvⱼ where the αᵢⱼ are a collection of n² scalars called a square matrix. If we first pick one basis {vⱼ} then almost any n² scalars will do to generate another basis by the above formula, the condition being that the determinant of their matrix not be 0. The generated uᵢ’s will span the space if and only if the determinant is not 0. When the determinant is not 0 we may find n² βᵢⱼ’s such that vᵢ = Σⱼβᵢⱼuⱼ. In that case the matrix of β’s and the matrix of α’s are inverses of each other. We dwell on this conversion here.
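A small numerical check, assuming numpy, with n = 2: pick a matrix of αᵢⱼ’s with nonzero determinant, generate the new basis, and recover the old one with the inverse matrix.

    import numpy as np

    V = np.array([[1., 0.],              # rows are the basis {v_j}
                  [0., 1.]])
    alpha = np.array([[2., 1.],          # u_i = Σ_j α_ij v_j
                      [1., 1.]])
    assert np.linalg.det(alpha) != 0     # so the generated u_i form a basis too

    U = alpha @ V                        # rows are the generated {u_i}
    beta = np.linalg.inv(alpha)          # then v_i = Σ_j β_ij u_j
    assert np.allclose(beta @ U, V)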

When we have two vector spaces, X and Y, over the same field but of respective dimensions m and n, we can choose a basis in each, say {uⱼ} and {vₖ}, (0≤j<m & 0≤k<n). For any linear operator A from X to Y there is a set of mn scalars γₖⱼ so that for any x∊X there are scalars {αⱼ} and {βₖ} such that:

x = Σⱼαⱼuⱼ
Ax = Σₖβₖvₖ
βₖ = Σⱼγₖⱼαⱼ.

The rectangular n by m matrix γₖⱼ is said to define the operator. These γₖⱼ’s depend on both choices of basis. There is a one-to-one correspondence between such n by m rectangular matrices and linear operators. All the familiar rules for mechanically adding and multiplying matrices can be derived from these notions. Addition of matrices corresponds to addition of operators, and multiplication of matrices corresponds to composition of operators.
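Here is a sketch, assuming numpy, of extracting the γₖⱼ’s from an operator: column j of the matrix holds the coordinates of Auⱼ in the basis {vₖ}. (The example uses standard bases, so components and coordinates coincide.)

    import numpy as np

    def matrix_of(A, basis_X, basis_Y):
        # column j holds the coordinates of A(u_j) in the basis {v_k}
        VY = np.column_stack(basis_Y)
        return np.column_stack([np.linalg.solve(VY, A(u)) for u in basis_X])

    A = lambda u: np.array([3.7*u[0] + 6*u[1], 2.1*u[0] + 5*u[1]])
    e = [np.array([1., 0.]), np.array([0., 1.])]
    gamma = matrix_of(A, e, e)           # [[3.7, 6.0], [2.1, 5.0]]

    x = np.array([2., -1.])
    assert np.allclose(gamma @ x, A(x))  # β = γα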

We may want to pursue these ideas into infinite dimensional vector spaces but this is not quite trivial. We may well consider infinite, or at least denumerable, basis sets. We will require a topology to give meaning to convergence of the infinite sums Σαᵢuᵢ. The subject was addressed elegantly by David Hilbert in about 1905 and we only touch on it here.

We speak above of constructing a vector space by merely inventing basis vectors out of thin air and then just assuming that there must be some vector space, satisfying the axioms, for which these vectors form a basis. This is a common practice and here is the excuse for such talk:
As we already mentioned, the field of a vector space itself satisfies the vector space axioms if we take the field’s multiplication for scalar multiplication and the field’s addition for vector addition. This yields V¹, a one dimensional vector space for which {1} (the 1 of the field) is the natural basis. We can now form a new 2D vector space V² = V¹⊕V¹ as a direct sum, with pairs <α, β> as the elements of V². One must verify that the axioms of a vector space are inherited by direct sums, but this is fairly easy. That <α, β> + <γ, δ> = <α + γ, β + δ> is the normal assumption in taking direct sums.

Now if we have invented three vectors p, q, r from thin air and want to claim them as a basis for some new vector space, then we take V³ = V¹⊕V¹⊕V¹ as our new vector space and take <1, 0, 0>, <0, 1, 0>, <0, 0, 1> respectively as p, q and r.

Subspaces

If Y is a subset of a vector space X and Y is closed under addition and scalar multiplication then Y is a subspace of X. Intersections of subspaces are also subspaces. The largest subspace of X is X itself and the smallest is {0}. The space {0} is a subspace of every other subspace. An n dimensional subspace may be determined by n independent vectors that span it.

A projective geometry of n−1 dimensions may be put into 1-1 correspondence with the subspaces of an n dimensional vector space over the reals. The easy way to do this is to adopt a basis of n vectors of X, choose one of those, x, and consider the set P of all linear combinations of the basis vectors where the coefficient of x is 1. This is not a subspace, for it does not contain the vector 0, but it is most of a projective space, just as Euclid’s plane is most of the 2D projective plane.

The field of the reals can be generalized to a finite field, thus attaining a finite model of an (n−1)-dimensional projective space. With a specified basis for X and a spanning set for an n dimensional subspace Y, we have n vectors, each expressed in coordinate form, which together form a matrix. The Reduced Row Echelon Form of that matrix is a unique concrete representation of the subspace. Via the map to projective spaces those matrices also uniquely identify the members of projective spaces.
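A sketch with sympy: two different spanning sets for the same plane in R³ reduce to the same Reduced Row Echelon Form, so the RREF names the subspace unambiguously.

    from sympy import Matrix

    Y1 = Matrix([[1, 0, 1],              # one spanning set for a plane in R^3
                 [0, 1, 1]])
    Y2 = Matrix([[1, 1, 2],              # another spanning set for the same plane
                 [1, -1, 0]])

    # the RREFs agree, so the two matrices present the same subspace
    assert Y1.rref()[0] == Y2.rref()[0]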

Dual Spaces
Warning (admission): the proofs here become slightly nontrivial and I am too lazy to fill them in where they are most needed.

An especially interesting class of linear operators comprises those that map a vector space into its scalars. Collectively this class forms another vector space, which is called the “dual” of the first. The dual of space X is written X*. The operations for this new space are defined, for f and g in X*, thus:

(f+g)(x) = f(x) + g(x)
(αf)(x) = α(f(x))
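Once a basis of X is fixed, each functional on Rⁿ is given by a row of scalars and evaluation is a dot product; a minimal sketch with numpy:

    import numpy as np

    f = np.array([1., 2., 3.])           # a functional on R^3, as a row
    g = np.array([0., -1., 1.])          # another functional
    ev = lambda p, x: p @ x              # p(x) as a dot product

    x = np.array([1., 1., 2.])
    assert ev(f + g, x) == ev(f, x) + ev(g, x)    # (f+g)(x) = f(x) + g(x)
    assert ev(5. * f, x) == 5. * ev(f, x)         # (αf)(x) = α(f(x))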

We shall prove that X** is not only isomorphic to X but that there is a natural isomorphism and we shall therefore usually confuse X with X**.

∀x (x ∊ X → ∃b (b ∊ X** & ∀p (p ∊ X* → b(p) = p(x)))), and furthermore the b is unique.
Suppose u, v ∊ X and A, B ∊ X*. We first locate a special element b of X** for any particular element x of X. The following equation defines b:

for every vector p in X*:
b(p) = p(x)

It must of course be verified that b is indeed linear, so that b really is an element of X**. We must still show that the correspondence x ↦ b is itself linear and, for finite dimensional X, a bijection.
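The natural map is short enough to write down in Python; b is simply “evaluate at x” (my own names, with X = R² and functionals as plain functions):

    def nat(x):
        # the natural map X -> X**: send x to the functional "evaluate at x"
        return lambda p: p(x)

    p = lambda u: u[0] + 2*u[1]          # an element of X* for X = R^2
    q = lambda u: 3*u[0] - u[1]          # another element of X*

    x = (4.0, 5.0)
    b = nat(x)                           # b(p) = p(x)
    # b is linear: b(p+q) = b(p) + b(q), with (p+q)(u) = p(u) + q(u)
    assert b(lambda u: p(u) + q(u)) == b(p) + b(q)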

This is an excellent point for a short excursion thru ‘category theory’. The above works only for finite dimensional vector spaces. For Hilbert space the dual space consists of those linear operators that are ‘bounded’, which means they map bounded sets to bounded sets. See Riesz about this. In finite dimensional spaces all operators are bounded.

We shall first wander a bit and discover some new scenery.

Choose an element b of X**. Choose two elements A and B of X*. b(A) and b(B) are two scalars.


Quadratic forms, Infinite dimensional spaces, The Quantum Mechanics Vector Space

To add: It is vital to internalize the relation between a vector space and its dual. It is at the core of understanding tangent bundles and cotangent bundles.