With a title like this, you know that it’s either fantastic, or so pompous that it’s about to get thrashed. Read on to find out which it is.
Title: Linear Algebra Done Right (Second Edition)
Author: Sheldon Axler
Original Publication Date: 1997
ISBN: 0-387-98258-2 paperback, 0-387-98259-0 hardcover
Cover Price: None; Amazon lists it for about $30US in paperback and $70US in hardcover. Note that I bought the paperback because it was cheaper, and ended up regretting it. More on that when I talk about the structure of the text, and get into the Low Point.
Hardcover purchase links: Amazon.com
Paperback purchase links: Amazon.com or Amazon.ca
This covers the material usually found in a second undergraduate linear algebra course. If you’ve had any formal mathematical courses which prefer abstraction (i.e., “x+y = y+x” is much more interesting than “1+2=3”) then you’ve got enough basis to pick this up and start reading from page one. If you don’t have that, but you’ve got high school math, then you should read chapter four first, and then go back and read the rest. (Chapter four takes the formal approach to polynomials, objects you’re probably quite comfortable with if you’ve chosen to read this review. This should be enough basis to then go back and read the text from the start.) I’d suggest reading this before taking your first linear algebra course, for reasons best discussed after I list the chapters. Those reasons may not make a lot of sense if you haven’t taken a linear algebra course, as they deal with the content of the course without actually defining many of the terms involved. If you haven’t taken the course yet, trust me when I say that the textbook has an extremely appropriate title, and then jump ahead to the individual category scores and discussions to convince yourself that this is the textbook you should buy to replace or supplement the one your class is actually using or planning to use.
The chapter list is as follows:
- Vector Spaces – an introduction to the basic objects of linear algebra.
- Finite-Dimensional Vector Spaces – Going into more detail on the type of vector space this text deals with. (By the way, if anyone can recommend a good text that delves into the infinite-dimensional cases, I’d like to know about it.)
- Linear Maps – Getting into the meat of things, letting us know what we can do with these vector spaces we’ve just created.
- Polynomials – A review of a topic that should be familiar to anyone picking up this book.
- Eigenvalues and Eigenvectors – if you’re an engineer or scientist who is being forced to take linear algebra, pay attention in this chapter. This is the critical chapter for solving complex systems, and is absolutely mandatory for working in some areas of science, such as quantum mechanics.
- Inner-product Spaces – Generalising the dot product and integrals of products, as well as other constructions we scientists may not be as familiar with.
- Operators on Inner-product Spaces – Operators are far more useful than many of my classmates realized. In fact, they are the main reason I was the only student to finish my fourth year quantum mechanics midterm within the allotted time, and I was the only student to finish it at all. If you work with math, learn these, whether your boss/university program makes you or not.
- Operators on Complex Vector Spaces – Again with the operators.
- Operators on Real Vector Spaces – Are you convinced that these things are useful yet?
- Trace and Determinant – Yes, that’s right. Determinants are covered in the last chapter of the entire book.
Those of you who have taken linear algebra will likely be surprised to see determinants show up that late. It seems as though my experiences were typical: the first semester of linear algebra introduced matrix multiplication and inversion in the first two weeks with little indication of what these things were useful for, and then followed up with three months’ worth of inverses, determinants, and row reduction, most of which felt like large numbers of simple calculations that didn’t seem to have any real point or benefit. The determinant itself felt, to me at least, like some arbitrarily defined algorithm that happened to turn out a useful number. When I say “useful,” I mean that we could classify all matrices as having either a determinant of 0, or a determinant of anything but 0. I honestly didn’t care if the determinant was 5 or 5000 as long as it wasn’t 0. Later (meaning a few weeks ago, when I read this, about a decade after taking that course) I learned of the volume interpretation of the determinant. That helped, as I suddenly had a meaning to associate with the determinant that cared about the difference between a determinant of 5 and a determinant of 5000. Still, it wasn’t a complete solution for me, as I didn’t think I should need to turn to geometry to lend meaning to a linear algebra quantity.
This textbook turns this around completely for me. It does deal with the volume interpretation in the last few pages, but that’s not the main interpretation of the determinant. Instead, the determinant is given a meaning I should have realized the first time I saw Jordan canonical form: the determinant of a matrix is the product of its eigenvalues. This is a completely developed meaning that works for me. With that meaning in place, I suddenly expect the determinant to have special meaning when it’s zero (a zero eigenvalue means the matrix is not invertible), and expect the determinant of a product to be the product of the determinants, so that det(AB) = det(BA) even when AB and BA are different matrices. More importantly, that seemingly arbitrary algorithm is motivated, explained, and then set aside as something to be used only when you can’t come up with the eigenvalues directly. This is just as it should be, and as it usually is once you take a second course in linear algebra. If you pick this up, you might actually want to take a second linear algebra course after the drudgery of the first.
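For anyone who would like to see that claim in numbers, here is a minimal Python sketch. It is an illustration only and is not taken from the book: the helper names `det2`, `eigenvalues2`, and `matmul2` and the matrices `A` and `B` are made up for this example. It also takes a shortcut the book avoids, finding the eigenvalues of a 2x2 matrix as the roots of t^2 - (trace)t + det = 0, so treat it as a sanity check rather than a proof.

```python
import cmath

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def eigenvalues2(m):
    """Eigenvalues of a 2x2 matrix, as roots of t^2 - trace*t + det = 0.

    cmath.sqrt is used so that complex eigenvalues are handled too.
    """
    (a, b), (c, d) = m
    trace = a + d
    root = cmath.sqrt(trace * trace - 4 * det2(m))
    return (trace + root) / 2, (trace - root) / 2

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Hypothetical example matrices, chosen so that AB != BA.
A = ((2, 1), (1, 3))
B = ((0, -1), (1, 0))  # rotation by 90 degrees; its eigenvalues are complex

lam1, lam2 = eigenvalues2(A)
print("det(A):", det2(A), " product of eigenvalues:", lam1 * lam2)

mu1, mu2 = eigenvalues2(B)
print("det(B):", det2(B), " product of eigenvalues:", mu1 * mu2)

AB, BA = matmul2(A, B), matmul2(B, A)
print("AB == BA?", AB == BA)
print("det(AB):", det2(AB), " det(BA):", det2(BA),
      " det(A)*det(B):", det2(A) * det2(B))
```

The eigenvalue products match the determinants up to floating-point rounding, and the two products AB and BA share a determinant even though they are different matrices.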
Treating the determinants with all the attention they deserve.
The lack of names for so many results in this subject makes it hard to follow several discussions. Even worse, the nature of the subject means that the references could be going back to any of the previous chapters, as there are just so many early results that are still useful as you press onward. As a result, this (or any linear algebra text that takes an abstract approach) often makes references to results and theorems from several chapters in a single proof, and can make those references only by number. I bought the paperback to save money, but given the amount of cross-referencing that needs to be done back and forth throughout the text just to follow the proofs, I regret not spending over twice as much on the hardcover. While I see no evidence of damage to the binding, I think that such damage may be inevitable when it’s bound by pressure and glue. If this were my first introduction to the subject, I’d have done a lot more page-flipping, too, which would make this a whole lot worse. I’d have also spent more time on each page, and more time on the exercises, just generally increasing the wear and tear. This could be alleviated in large degree if each chapter began with a quick summary of previous results that will come up in the next few pages. Not only would it improve the longevity of the binding, but it would give the student a sort of advance organizer, allowing for a better anticipation of the general direction of the proofs, which would in turn make the proofs quicker to follow and easier to read.
The clarity of the text is quite good. Rather than the usual structure of an abstract math textbook (with a few words scattered between lines and lines of algebra), Axler will make a point of including several paragraphs in the body of the text, motivating the future calculations and results. This feels more like a transcript of the course lectures than it does like an abstract text, and yet the formality of the proofs is there. (Well, there are some exceptions in which results from outside linear algebra are used in a less formal manner, but they are clearly marked with enough information for the doubtful reader to go confirm the statements him/herself.) The relegation of determinants until the end has greatly helped this, as the bloody things actually seem necessary by the time the reader gets to them. The only drawback is that the history of the subject means many results aren’t named, so we’re referring to things like “5.32” more often than “the Cayley-Hamilton theorem.” I give it 4 out of 6.
The structure was well formed. The “summary of previous results” section I mentioned above would be useful, but I can’t think of a single textbook I own (and they are legion) that actually has one. As such, I’m not sure the author can really be faulted for not including one. The rest of the textbook is well structured. Many textbooks leave the proof of some results for later in the text, which means building on ground that seems shaky for a while. (In some cases, it feels shaky for me after I finish the text, as I’m never quite sure if the proof of the assumed result depends on the proofs of the results it was assumed for.) Topics are rarely left until later. The introduction mentions that determinants are left for the end, as it should. The other instances are more along the lines of “we’ll deal with the case involving real numbers after the case involving complex numbers,” or “we’ll get to the proof of A, but we’ll need to use result B to complete that proof, so let’s prove B now.” Both of these cases are the kind that should be present. Not only are required results in place before the other proofs occur, but we know why we seem to be detouring, and when we’ll return to the other case. I give it 6 out of 6.
The examples were minimal. There was usually one example for each topic that was hard to grasp in abstraction, and the examples are usually to be completed by the reader. (In other words, an abstract property is introduced, and is followed by a comment like “the matrix <insert specific matrix here> has the property just discussed, as the reader should verify.”) If the reader doesn’t actually understand the subject, he or she will recognize this on failing to verify the property, but will be lacking a worked example to catch any errors that may have been made. As such, the examples are often more like integrated exercises. I give it 3 out of 6.
The exercises themselves are well designed, without the usual computational grunt work. Many of the exercises are proofs that lead to deeper understandings of several secondary topics. In other words, even the exercises are instructive, which is as they should be. Solutions are not provided to the general public, but an answer key is available to those who are teaching from the text for a course. Once again, this is not ideal for those who are reading the textbook outside of a traditional classroom. I give it 5 out of 6.
The text seemed complete in that it covered what was covered in my second linear algebra course, along with the additions of complex vector spaces and the volume interpretation of the determinant. It didn’t go beyond that, though, which again means it’s better suited to a traditional classroom than to self-study. Still, it never claims to be anything else. I give it 5 out of 6.
The editing in this was well done. I didn’t notice any errors or typos of any kind (though I admit I didn’t do all of the exercises). The author’s website mentions that there was a seventh corrected printing in 2004, so it’s possible that they’ve all been weeded out. I give it 6 out of 6.
Overall, it’s the most highly motivated introduction to the subject I’ve seen. It’s clear, concise, and it presents the material in a way that truly sets this text apart from others in the field. The title really does fit; if you need an introduction to this subject, this is it. I give it 5 out of 6.
In total, Linear Algebra Done Right receives 34 out of 42.
UPDATED Feb. 19, 2006 at 9:39am
Originally I compared the meaning of “5.32” to the meaning of “the Cauchy-Hamilton theorem,” stating that the second was more meaningful. Since there is no Cauchy-Hamilton theorem, that’s obviously nonsense. I’ve changed it to the “Cayley-Hamilton theorem,” though I could have chosen the “Cauchy-Schwarz inequality” instead. The point I was trying to get across is that names have meanings and contexts that don’t depend on the particular text one works from, while numbers depend entirely on the text one works from.