Why I keep a reference implementation
A reference implementation is less about purity than about preserving the shape of the answer while the code changes form.
Most of the work I care about does not stay in one language for long.
An idea usually starts in a high-leverage environment: MATLAB, a notebook, or some other analysis-friendly place where the first question is “is this the right idea?” instead of “is this the final form?” That first version is not the product. It is the place where the design earns the right to exist.
But once the idea survives contact with data, it has to move. Sometimes that means VHDL. Sometimes it means real-time C on hardware that does not have much patience for abstraction. In either case, the risk is the same: the implementation starts changing shape for good local reasons, and after enough small changes, nobody can say with confidence whether the system is still computing the same thing.
That is why I keep a reference implementation around.
A reference implementation is not there to win benchmarks. It is there to act as the stable answer while everything else changes form around it. When the embedded build, the FPGA block, or the integration environment starts behaving strangely, the reference gives me somewhere to stand. If the outputs diverge, I can ask a concrete question: is the problem in the algorithm, in the translation, or in the surrounding system?
This sounds obvious, but it changes the day-to-day work more than people expect.
First, it makes debugging compositional. Instead of arguing from intuition, I can compare stages. Did the acquisition metric move? Did the loop state move? Did quantization or scaling change something that looked harmless in review? A good reference narrows the search.
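As a rough illustration of comparing stages, here is a minimal Python sketch. The stage names, the sample values, and the per-stage tolerances are all hypothetical; a real pipeline would capture its own intermediate outputs from the reference and from the translated build. The idea is only to walk the stages in order and stop at the first one where the two disagree by more than an agreed tolerance.

```python
import numpy as np

def first_divergent_stage(reference, candidate, tolerances):
    """Return (stage_name, max_abs_error) for the first stage whose
    outputs differ by more than its tolerance, or None if all agree.

    Stages are checked in the insertion order of `reference`.
    """
    for name, ref_out in reference.items():
        ref_arr = np.asarray(ref_out, dtype=float)
        cand_arr = np.asarray(candidate[name], dtype=float)
        err = float(np.max(np.abs(ref_arr - cand_arr)))
        if err > tolerances[name]:
            return name, err
    return None

# Hypothetical captured outputs, keyed by stage name (illustrative values only).
reference = {
    "acquisition_metric": np.array([0.10, 0.80, 0.15]),
    "loop_state": np.array([0.00, 0.25, 0.50]),
    "output": np.array([1.0, 2.0, 3.0]),
}
candidate = {
    "acquisition_metric": np.array([0.10, 0.80, 0.15]),
    "loop_state": np.array([0.00, 0.30, 0.50]),  # differs from the reference here
    "output": np.array([1.0, 2.1, 3.0]),         # downstream of the divergence
}
tolerances = {name: 1e-3 for name in reference}

result = first_divergent_stage(reference, candidate, tolerances)
```

Here the output stage also disagrees, but the search stops at the loop state, which is the earlier and more useful place to start looking.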
Second, it changes how I write the prototype. If I know the design will eventually land in constrained hardware or real-time software, I try to make the reference implementation honest early. I still want the speed of iteration, but I do not want a prototype that hides the eventual cost model so thoroughly that the translation becomes a second design exercise.
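One way to keep a prototype honest about the eventual cost model is to let the reference run its arithmetic through the same word lengths the target will use. The sketch below is an assumption-laden example, not a description of any particular target: it assumes a signed 16-bit word with 15 fractional bits, round-to-nearest, and saturation on overflow. Real hardware may truncate instead of rounding, or wrap instead of saturating, and the function name `quantize` is my own.

```python
import numpy as np

def quantize(x, frac_bits=15, total_bits=16):
    """Model a signed fixed-point word: round to the nearest step of
    2**-frac_bits, then saturate to the range of a total_bits integer.

    Note: np.round rounds exact halves to the nearest even code, which
    may not match the rounding mode of a given target.
    """
    scale = 2 ** frac_bits
    lowest_code = -(2 ** (total_bits - 1))
    highest_code = 2 ** (total_bits - 1) - 1
    codes = np.round(np.asarray(x, dtype=float) * scale)
    codes = np.clip(codes, lowest_code, highest_code)
    return codes / scale

# Illustrative inputs: two representable values, one that needs rounding,
# and two that overflow the assumed format.
samples = np.array([0.5, -0.25, 0.1, 1.5, -2.0])
quantized = quantize(samples)
quantization_error = np.abs(quantized - samples)
```

Running the same stage with and without `quantize` applied shows early whether scaling or saturation changes the answer, before the translation to hardware makes that question expensive.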
Third, it makes conversations with teammates cleaner. “The implementation is wrong” is not very helpful. “The implementation diverges from the reference after this stage, under these inputs” is much better. The second version gives everybody something testable.
I do not mean that a reference implementation is sacred. Sometimes the translation reveals that the first model was incomplete or too convenient. That is good news. It means the design learned something real. But if the reference changes, I want it to change deliberately, for reasons I can explain, not because the implementation wandered and the explanation arrived later.
For me, the reference implementation is really a discipline of memory. It keeps the work from becoming a series of loosely related rewrites. It preserves the shape of the answer while the code, hardware, and system context keep changing.