I’ve been meditating on a point of chronology for the past week or so: I’m about the age my father was when he pushed away from computer technology, which at the time was right as the internet became a thing. For him this was striking because the entire basis of his work from the time I was born to then was industrial computing. He sold, installed, and repaired large industrial memory boards for special systems, got out of that particular product line when PCs were about to take over and standardize the peripherals, and then, right at the point we’re talking about, he was working with a Unix-based product to connect digital sensors and actuators to control industrial plants from a single machine. He sold that business back to the guys he bought it from, because it wouldn’t work right going to one central machine.
This was about six months before the internet became the internet, and networking computers really took off. Looking back now, the model of this product wasn’t right. What it needed to be was a couple of computers splitting the load of the whole factory’s control system input and output and then sending stuff to central. But I think this was the point where my dad lost interest, because it was a technology different from what he was expecting. You’d have expected him to be right in the middle of it from ‘89 to ‘96; it was a really interesting point of inflection for the industry he knew, and like the ‘83 to ‘85 standardization around PCs, he saw the wrong path in time to get off of it. But in ‘85 he ran to the right path, and in ‘89 he just kind of stopped.
That was the preface; the main thing here is that I was asked again about AI in quiz bowl. As you’ve read here before, I remain very, very wary of it: the problem it has with hallucination, and my personal analysis that (at the risk of personifying it) AI tends to be a people pleaser, far too willing to create an answer when an “I don’t know” should be employed. In meditating on this point, I came up with the crux of my unease with AI, and it involved not quiz bowl but my day job, working with simulation and finite element analysis.
A quick trip through finite element analysis and simulation:
If you’ve ever taken a mechanics or physics course, you’re probably familiar with the concept of a spring-mass-damper system.
[Figure: a spring-mass-damper system]
The usual lab with the spring-mass-damper is to drive it with a force, displacement, velocity, or acceleration at t=0; you can then work out the position the system takes for all subsequent times, because there are well-defined mathematical formulas that determine it. Once you’ve worked that out, you can consider a system of two linked spring-mass-dampers, and the two equations that link their positions. Then you can get to n spring-mass-dampers, apply them in different directions, then apply different sorts of conditions and material properties, and ultimately develop a complex system of spring-mass-dampers that models a physical object. The general equation doesn’t change, just the number of equations involved. Finite element methods just take an object, break it into a bunch of spring-mass-damper blocks, and create an equation for each block, which the software endeavors to solve. That conversion is not always exact: any curved surface gets linearized as finite elements, most materials don’t have a purely linear spring constant, etc.; those deviations from perfect systems introduce small errors in the calculation. When Ansys, the day job, was founded, this was cutting-edge technology that required high-powered computing to solve problems that would otherwise take humans lots of time doing repetitive calculations.
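To make that concrete, here’s a minimal sketch of one of those “well-defined formulas” for a single spring-mass-damper: the closed-form displacement of an underdamped system released from rest. The parameter values are mine, picked for illustration, not from any actual lab or Ansys case.

```python
import math

def underdamped_response(t, m=1.0, c=0.4, k=25.0, x0=0.01):
    """Closed-form displacement of a free underdamped spring-mass-damper
    released from rest at initial displacement x0 (illustrative values)."""
    wn = math.sqrt(k / m)                 # natural frequency (rad/s)
    zeta = c / (2.0 * math.sqrt(k * m))   # damping ratio (< 1 for underdamped)
    wd = wn * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency (rad/s)
    return x0 * math.exp(-zeta * wn * t) * (
        math.cos(wd * t) + (zeta * wn / wd) * math.sin(wd * t)
    )

for t in (0.0, 0.5, 1.0, 2.0):
    print(f"x({t:.1f} s) = {underdamped_response(t):+.5f} m")
```

Two linked spring-mass-dampers just add a second equation coupled through the shared spring; n of them give you n coupled equations, which is exactly the kind of system the finite element software assembles and solves.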
Ansys creates simulations of physical processes to solve problems in engineering analysis. There was a time when finite element methods were the unproven technology, and to prove them, we had a verification manual of hundreds of situations where we solved the original physics problem by traditional means to get an exact solution, then created the same case with finite element methods, and determined the error from the exact solution.
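In the same spirit, here’s a toy version of a verification case, assuming nothing from any real manual: the oscillator above solved two ways, once with the exact formula and once with a crude time-stepping simulation, with the error reported between them.

```python
import math

# Hypothetical verification-case parameters (illustrative, not from any manual).
m, c, k, x0 = 1.0, 0.4, 25.0, 0.01
wn = math.sqrt(k / m)
zeta = c / (2.0 * math.sqrt(k * m))
wd = wn * math.sqrt(1.0 - zeta ** 2)

def exact(t):
    """Traditional closed-form solution of m*x'' + c*x' + k*x = 0."""
    return x0 * math.exp(-zeta * wn * t) * (
        math.cos(wd * t) + (zeta * wn / wd) * math.sin(wd * t)
    )

# "Simulation": semi-implicit Euler time stepping of the same equation.
dt, x, v = 1e-4, x0, 0.0
for _ in range(int(1.0 / dt)):   # march out to t = 1.0 s
    a = (-c * v - k * x) / m     # acceleration from the equation of motion
    v += a * dt
    x += v * dt

err = 100.0 * abs(x - exact(1.0)) / abs(exact(1.0))
print(f"simulated x(1.0) = {x:+.6f} m, exact = {exact(1.0):+.6f} m, "
      f"error = {err:.3f}%")
```

That last line is the whole point: the exact answer, the simulated answer, and a quantified error sitting right next to each other.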
My chief objection to AI is that it’s not going through the same level of scrutiny and verification as I would expect for a process that design and safety must rely on. It’s being used and taken as gospel without anyone ever having to develop a set of verification exercises for the tasks it’s being applied to. And most AI has an emergent goal: to please the user, to come up with a solution. Sometimes it should fail to generate a solution.
It’s also being used without users being able to consider how wrong it can get. In the verification manual I mentioned, you had the mathematical solution right there, the finite element solution right there, and next to them the error calculated from those two numbers. I know there are benchmarks for AI that their makers point to as error percentages, how often the model fails to reach the correct answer, but the public does not delve into the details of those benchmarks, and those are aggregates over many trials. What we don’t get back from our single use of an AI to answer a question is any calculation of how likely that particular answer is to be correct. We can’t pull a quantified assessment of its own error, and that unquantifiability is one of the things that makes me uneasy.
The benchmarks also give a false sense of security, in that they limit the subjects of the questions posed to the AI, and then users take that limited test as applying universally. That isn’t on the AI; that’s on us. The verification manual for our customers indicated the features it tested (in the case of my tests, radiative heat transfer in 2D and 3D using the then newly developed RSurf251 and RSurf252 elements), and that was all my verification tests covered. The verification was limited to stating that feature’s safety and accuracy; it didn’t suggest you could use it for situations beyond the scope of the element. For instance, it ignored the elements between the surface elements and had no effect on them. We specified that, and made sure the user would know that was an untested usage. In contrast, most AI is tested only for certain situations and treated by the public as universally applicable. It’s the difference between testing the product in-house versus making all users pre-alpha testers. If you’re using it without treating the results with complete suspicion, you’re putting the risk on yourself.
So to come back to the preface of this: I started to realize that I’m facing the same sort of moment my dad did. I’m going to be forced to consider the utility of AI, and come to grips with knowing how much I can trust it. The difference is I don’t think my dad’s option, keeping his hands clean of it, pushing it away, and retiring from the field, is an option for me.
I’m uneasy about this whole thing because I know it’s coming and I know it’s going to affect things in my day job. It’s not that the AI might replace me; it’s that those who trust blindly in the AI might replace me in the role of its examiner.
On a parallel track, I’m starting to think that AI will find lots of problems with quiz bowl questions. Not in answering them, but in constructing them.
It would be possible for AI to create a question which leads the player to the answer in a way equivalent to human logic and intuition, but the likelihood of that diminishes with the number of lexical tokens in the question. Put simply, the longer the path to the answer, the more things have to interact correctly. As we’ve shown before, the simplest one-fact, one-sentence questions would be an easy task, so easy it doesn’t require an AI or LLM to automate the process.
It’s not that it can’t present the facts, or even that it will have to present the facts in pyramidal order while also forming grammatically correct sentences; it’s that it will have to move the human mind from partial knowledge to precise knowledge while doing all that. And then it must do that reliably, for all subject matter. That seems to be a hard problem; not an unsolvable one, but that kind of sparking of a human reaction is really, really specific, and might not be worth tasking a general AI to solve. It might actually be too expensive to run the human subjects through it to generate the data.
Last Sunday was Father’s Day, my first since getting the job on a full-time basis. Last year for Father’s Day, Catie went over to the hospital to see John, whose fall and scratched cornea started the long decline. She was visibly rattled by even dealing with me on this day. That was even before my suggestion that they go see him was shot down. It was a foolish idea on my part; we haven’t put the memorial stone on the grave, so we don’t really know where his remains are buried. She’s still got some scar tissue from where he was. I want it to heal; I don’t want to insert myself in the hole that’s there.
As for what started the preface: we brought my dad home again. While he seemed a bit more lucid when fully awake, you could tell he hasn’t got as much energy to stay that way. He took two naps during his visit, and didn’t really fight going back to the facility. I’m not sure if that’s acceptance or just realism.