Monday, March 31, 2025

SAXPY and GAXPY: The Algebra Behind Modern AI

At the Department of Information, Electronics and Telecommunications Engineering (DIET) at Sapienza University of Rome we fondly remember Prof. Elio Di Claudio, full professor of Circuit Theory and of several Master's degree courses, who passed away too soon. In his course "Sensor Arrays" he used to strongly advise students to study "Matrix Computations", the very influential book on numerical linear algebra first published in 1983 by Gene H. Golub and Charles F. Van Loan. In the vast landscape of artificial intelligence and deep learning, it is easy to overlook the foundational algorithms that silently power even the most advanced models. But beneath every Transformer, every Large Language Model, and every GPU-accelerated training loop, there are humble, decades-old operations still doing the heavy lifting. Among these are SAXPY and GAXPY, terms that may sound obscure to non-specialists today, but which remain essential even in the age of ChatGPT and GPT-4.

"Matrix computation" remains an essential guide today in low-level routines that require advanced algebraic calculations and Machine Learning is full of it, as is generative AI. Hence, this book remains a reference point for anyone working with matrix algorithms, from theoretical researchers to engineers designing high-performance scientific computing systems.

Let's do a very brief review of the SAXPY and GAXPY operations.

SAXPY stands for Scalar A times X Plus Y (in the BLAS naming convention the leading letter also denotes precision: SAXPY for single precision, DAXPY for double), and it's a simple vector update operation. Formally, it computes:

y := αx + y

Where x and y are vectors of the same length, and α is a scalar. In component form:

yᵢ := α·xᵢ + yᵢ,  for all i

This operation appears in Level 1 of the BLAS (Basic Linear Algebra Subprograms) and expresses one of the most frequent patterns in linear algebra: updating a vector with a scaled copy of another. Here's a basic Python implementation using NumPy:

import numpy as np

alpha = 2.0
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# SAXPY update: y := alpha*x + y
y = alpha * x + y
print(y)  # Output: [ 6.  9. 12.]
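
The expression above lets NumPy do the work; the actual BLAS Level-1 routine can also be called directly. Here is a minimal sketch, assuming SciPy is installed (it exposes the BLAS routines through scipy.linalg.blas):

import numpy as np
from scipy.linalg.blas import saxpy  # single-precision BLAS Level-1 routine

x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([4, 5, 6], dtype=np.float32)

# y := a*x + y, executed by the underlying BLAS implementation
y = saxpy(x, y, a=2.0)
print(y)  # [ 6.  9. 12.]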

GAXPY, on the other hand, generalizes this idea to matrices. It stands for Generalized A times X Plus Y (a "generalized SAXPY", in Golub and Van Loan's terminology) and describes a column-oriented approach to matrix-vector multiplication. Instead of computing the dot product of each row of a matrix A with the vector x, as in the standard row-oriented approach, GAXPY accumulates:

y := Σⱼ xⱼ·aⱼ

where aⱼ is the j-th column of the matrix A, and xⱼ is the j-th component of the vector x. Each iteration is essentially a SAXPY operation. Here's a small Python example:

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])

x = np.array([10, 20])
y = np.zeros(3)

# Each pass of the loop is a SAXPY: y := x[j] * A[:, j] + y
for j in range(len(x)):
    y += x[j] * A[:, j]

print(y)  # Output: [ 50. 110. 170.]
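
As a quick sanity check (nothing here beyond plain NumPy), the column-oriented loop produces exactly the same vector as NumPy's built-in matrix-vector product, which dispatches to an optimized GEMV routine under the hood:

# The GAXPY loop is mathematically identical to a matrix-vector product
print(np.allclose(y, A @ x))  # True

# Row-oriented alternative for contrast: each y[i] is the dot product
# of row i of A with x
y_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])
print(y_rows)  # [ 50 110 170]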

Now, you might be wondering: what do these old operations have to do with modern deep learning models like Transformers or GPT? The answer is—everything.

At the heart of every neural network layer, especially in Transformers, lie massive matrix multiplications. The attention mechanism alone involves computing QKᵀ, applying softmax, and then multiplying the result by V. These are all dense matrix-matrix or matrix-vector multiplications. When we train models, gradients are computed via backpropagation, and the parameter updates, such as in SGD or Adam, apply operations that are, at their core, vector updates of the form:

θ := θ − α·∇θL

where α is the learning rate and ∇θL is the gradient of the loss with respect to the parameters θ. This is essentially a SAXPY operation again. Deep learning frameworks like PyTorch, TensorFlow, and JAX don't expose SAXPY directly, but they all build on top of libraries that implement it. Under the hood, PyTorch uses cuBLAS for NVIDIA GPUs and MKL or OpenBLAS for CPUs. These libraries include high-performance versions of SAXPY, GEMV, GEMM, and related routines, and these are the building blocks of every forward and backward pass in neural networks.
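
To make the connection concrete, here is a minimal sketch in plain NumPy (the names params, grads, and lr are illustrative, not taken from any specific framework) of a vanilla SGD step written as an AXPY-style update:

import numpy as np

# Hypothetical parameter vector and its gradient, e.g. one flattened layer
params = np.random.randn(1_000_000)
grads = np.random.randn(1_000_000)
lr = 0.01  # learning rate, playing the role of the scalar alpha

# theta := theta - alpha * grad(L): a SAXPY with alpha = -lr
params += (-lr) * grads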

On GPUs, especially when training large models, these operations are optimized using techniques like kernel fusion. A single SAXPY might not be efficient on its own because it’s memory-bound, but when fused with other operations, or applied over millions of parameters in parallel, it becomes incredibly effective. Libraries like cuBLAS, XLA (in JAX), and Triton (used by some PyTorch kernels) apply massive parallelism and scheduling strategies to run these operations efficiently on thousands of GPU cores.
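
To give a flavor of what fusion buys, here is a toy illustration in NumPy. Note that NumPy itself does not fuse anything (the intermediate array is still materialized); the comments describe what a fusing compiler such as XLA or Triton would do with the same computation:

import numpy as np

alpha = 2.0
x = np.random.randn(10_000_000)
y = np.random.randn(10_000_000)

# Unfused view: two logical kernels, two full passes over memory
tmp = alpha * x + y         # SAXPY kernel: read x and y, write tmp
out = np.maximum(tmp, 0.0)  # ReLU kernel: read tmp, write out

# Fused view: a fusing compiler would emit a single kernel that reads
# x and y once and writes out once, skipping the round-trip through tmp
out = np.maximum(alpha * x + y, 0.0)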

So even though today’s machine learning models deal with billions of parameters and require massive compute, the core operations remain surprisingly simple. The genius lies in the layering, optimization, and orchestration—not in reinventing the algebra.

To understand modern AI, it’s worth remembering that every Transformer is still built on operations described in Matrix Computations. SAXPY and GAXPY are not relics of the past; they are the silent workhorses of today’s AI revolution.

As Golub and Van Loan reminded us decades ago, understanding these basic patterns is not only useful—it's essential. Because while the models have changed, the math hasn't.

---

This post is written in memory of Prof. Elio Di Claudio, who was not able to see the wonders of algebraic and matrix techniques in LLMs and generative AI, as he passed away a few months before the fateful November 30, 2022, the date ChatGPT was launched to the general public. I am sure that he would have come to know these architectures perfectly and would have been available, as always, to systematize them within his already extensive technical knowledge.

2025 - Studies in Computational Intelligence (SCI, volume 1196, Springer) finally published

SCI, volume 1196, Springer
 

The collection "Studies in Computational Intelligence (SCI, volume 1196)" has finally been published; it contains the extended book-chapter versions of our works selected from IJCCI, the International Joint Conference on Computational Intelligence (2022).

Our contributions concern energy sustainability, Smart Grids, and Renewable Energy Communities, in particular modeling and control techniques and energy forecasting.

Specifically, the two studies are the following:

1) Antonino Capillo, Enrico De Santis, Fabio Massimo Frattale Mascioli, and Antonello Rizzi, On the Performance of Multi-Objective Evolutionary Algorithms for Energy Management in Microgrids

Abstract. In the context of Energy Communities (ECs), where energy flows among PV generators, batteries and loads have to be optimally managed not to waste a single drop of energy, relying on robust optimization algorithms is mandatory. The purpose of this work is to reasonably investigate the performance of the Fuzzy Inference System-Multi-Objective-Genetic Algorithm model (MO-FIS-GA), synthesized for achieving the optimal Energy Management strategy for a docked e-boat. The MO-FIS-GA performance is compared to a model composed of the same FIS implementation related to the former work but optimized by a Differential Evolution (DE) algorithm – instead of the GA – on the same optimization problem. Since the aim is not evaluating the best-performing optimization algorithm, it is not necessary to push their capabilities to the max. Rather, a good meta-parameter combination is found for the GA and the DE such that their performance is acceptable according to the technical literature. Results show that the MO-FIS-GA performance is similar to the equivalent MO-FIS-DE model, suggesting that the former could be worth developing. Further works will focus on proposing the aforementioned comparison on different optimization problems for a wider performance evaluation, aiming at implementing the MO-FIS-GA on a wide range of real applications, not only in the nautical field.


2) Sabereh Taghdisi Rastkar, Danial Zendehdel, Antonino Capillo, Enrico De Santis, and Antonello Rizzi, Seasonality Effect Exploration for Energy Demand Forecasting in Smart Grids

Abstract. Effective energy forecasting is essential for the efficient and sustainable management of energy resources, especially as energy demand fluctuates significantly with seasonal changes. This paper explores the impact of seasonality on forecasting algorithms in the context of energy consumption within Smart Grids. Using three years of data from four different countries, the study evaluates and compares both seasonal models – such as Seasonal Autoregressive Integrated Moving Average (SARIMA), Seasonal Long Short-Term Memory (Seasonal-LSTM), and Seasonal eXtreme Gradient Boosting (Seasonal-XGBoost) – and their non-seasonal counterparts. The results demonstrate that seasonal models outperform non-seasonal ones in capturing complex consumption patterns, offering improved accuracy in energy demand prediction. These findings provide valuable insights for energy companies and for the design of intelligent Energy Management Systems, suggesting optimized strategies for resource allocation and underscoring the importance of advanced forecasting methods in supporting sustainable energy practices in urban environments.


BibTeX book citation:

@book{back2025computational,
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  title     = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  year      = {2025},
  publisher = {Springer},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  doi       = {10.1007/978-3-031-85252-7}
}
 

Single-chapter BibTeX references:

@incollection{capillo2025performance,
  author    = {Antonino Capillo and Enrico De Santis and Fabio Massimo Frattale Mascioli and Antonello Rizzi},
  title     = {On the Performance of Multi-Objective Evolutionary Algorithms for Energy Management in Microgrids},
  booktitle = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  publisher = {Springer},
  year      = {2025},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  chapter   = {1},
  pages     = {1--10},
  doi       = {10.1007/978-3-031-85252-7_1}
}
 

@incollection{rastkar2025seasonality,
  author    = {Sabereh Taghdisi Rastkar and Danial Zendehdel and Antonino Capillo and Enrico De Santis and Antonello Rizzi},
  title     = {Seasonality Effect Exploration for Energy Demand Forecasting in Smart Grids},
  booktitle = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  publisher = {Springer},
  year      = {2025},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  pages     = {211--223},
  doi       = {10.1007/978-3-031-85252-7_12},
  url       = {https://link.springer.com/chapter/10.1007/978-3-031-85252-7_12}
}
 


