Monday, March 31, 2025

SAXPY and GAXPY: The Algebra Behind Modern AI

At the Department of Information, Electronics and Telecommunications Engineering (DIET) at Sapienza University of Rome we fondly remember Prof. Elio Di Claudio, full professor of Circuit Theory and of Master's degree courses, who passed away too soon. In his "Sensor Arrays" course he strongly advised students to study Matrix Computations, the very influential 1983 book on numerical linear algebra by Gene H. Golub and Charles F. Van Loan. In the vast landscape of artificial intelligence and deep learning, it's easy to overlook the foundational algorithms that silently power even the most advanced models. But beneath every Transformer, every Large Language Model, and every GPU-accelerated training loop, there are humble, decades-old operations still doing the heavy lifting. Among these are SAXPY and GAXPY — terms that may sound obscure today for non-specialists, but which remain essential even in the age of ChatGPT and GPT-4.

"Matrix computation" remains an essential guide today in low-level routines that require advanced algebraic calculations and Machine Learning is full of it, as is generative AI. Hence, this book remains a reference point for anyone working with matrix algorithms, from theoretical researchers to engineers designing high-performance scientific computing systems.

Let's do a very brief review of the SAXPY and GAXPY operations.

SAXPY stands for Scalar A times X Plus Y (in the BLAS naming convention the leading "S" actually denotes single precision, with DAXPY as the double-precision variant), and it's a simple vector update operation. Formally, it computes:

y := αx + y

Where x and y are vectors of the same length, and α is a scalar. In component form:

yᵢ := αxᵢ + yᵢ   for all i

This operation appears in Level 1 of the BLAS (Basic Linear Algebra Subprograms), and expresses one of the most frequent patterns in linear algebra: updating a vector based on another scaled vector. Here’s a basic Python implementation using NumPy:

import numpy as np

alpha = 2.0
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

y = alpha * x + y   # the SAXPY update: y := alpha*x + y
print(y)            # Output: [ 6.  9. 12.]
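For the curious, SciPy exposes the corresponding BLAS routine directly. Below is a minimal sketch, assuming SciPy is installed; daxpy is the double-precision variant of the same operation (saxpy is the single-precision one), and the call is just another way of writing the update above:

import numpy as np
from scipy.linalg.blas import daxpy

alpha = 2.0
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

y = daxpy(x, y, a=alpha)  # y := alpha*x + y, executed by the BLAS library
print(y)                  # [ 6.  9. 12.]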

GAXPY, on the other hand, generalizes this idea to matrices. It stands for Generalized A times X Plus Y and describes a column-oriented approach to matrix-vector multiplication. Instead of computing the dot product of each row of a matrix A with the vector x, as in the row-oriented formulation of GEMV, GAXPY accumulates scaled columns:

y := Σⱼ xⱼ aⱼ

Where aⱼ is the j-th column of the matrix A, and xⱼ is the j-th component of the vector x. Each iteration is essentially a SAXPY operation. Here's a small Python example:

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])

x = np.array([10, 20])
y = np.zeros(3)

# accumulate the scaled columns of A: each iteration is a SAXPY
for j in range(len(x)):
    y += x[j] * A[:, j]

print(y)  # Output: [ 50. 110. 170.]
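As a quick sanity check (using only the arrays already defined above), the column-oriented accumulation gives exactly the same result as NumPy's built-in matrix-vector product, which for floating-point data dispatches to an optimized BLAS routine:

print(np.allclose(y, A @ x))  # True: the GAXPY loop computes A @ x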

Now, you might be wondering: what do these old operations have to do with modern deep learning models like Transformers or GPT? The answer is—everything.

At the heart of every neural network layer, especially in Transformers, lie massive matrix multiplications. The attention mechanism alone involves computing QKᵀ, applying softmax, and then multiplying the result by V. These are all dense matrix-matrix or matrix-vector multiplications. When we train models, gradients are computed via backpropagation, and the parameter updates—such as in SGD or Adam—apply operations that are, at their core, vector updates of the form:

θ := θ − α ∇θL

Which is essentially a SAXPY operation again, with the scalar equal to minus the learning rate. Deep learning frameworks like PyTorch, TensorFlow, and JAX don't expose SAXPY directly, but they all build on top of libraries that implement it. Under the hood, PyTorch uses cuBLAS for NVIDIA GPUs and MKL or OpenBLAS for CPUs. These libraries include high-performance versions of SAXPY, GEMV, GEMM, and related routines. These are the building blocks of every forward and backward pass in neural networks.
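As a toy illustration (a minimal sketch, not how any production optimizer is actually written), the SGD step is just the SAXPY pattern applied to the flattened parameter vector:

import numpy as np

lr = 0.01
theta = np.random.randn(1_000)  # flattened model parameters (illustrative size)
grad = np.random.randn(1_000)   # gradient of the loss with respect to theta

theta += -lr * grad             # the same alpha*x + y update, with alpha = -lr
# in PyTorch, the analogous in-place call would be theta.add_(grad, alpha=-lr)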

On GPUs, especially when training large models, these operations are optimized using techniques like kernel fusion. A single SAXPY might not be efficient on its own because it’s memory-bound, but when fused with other operations, or applied over millions of parameters in parallel, it becomes incredibly effective. Libraries like cuBLAS, XLA (in JAX), and Triton (used by some PyTorch kernels) apply massive parallelism and scheduling strategies to run these operations efficiently on thousands of GPU cores.
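To give a feel for why fusion matters, here is a small, hedged comparison in PyTorch (assuming PyTorch is installed; actual gains depend on tensor sizes and backend): writing alpha * x + y materializes a temporary for alpha * x, while torch.add with its alpha argument performs the whole scaled addition in one call.

import torch

alpha = 2.0
x = torch.randn(1_000_000)
y = torch.randn(1_000_000)

out_unfused = alpha * x + y               # two passes: a temporary alpha*x, then the add
out_fused = torch.add(y, x, alpha=alpha)  # a single call computing y + alpha*x
print(torch.allclose(out_unfused, out_fused))  # True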

So even though today’s machine learning models deal with billions of parameters and require massive compute, the core operations remain surprisingly simple. The genius lies in the layering, optimization, and orchestration—not in reinventing the algebra.

To understand modern AI, it’s worth remembering that every Transformer is still built on operations described in Matrix Computations. SAXPY and GAXPY are not relics of the past; they are the silent workhorses of today’s AI revolution.

As Golub and Van Loan reminded us decades ago, understanding these basic patterns is not only useful—it's essential. Because while the models have changed, the math hasn't.

---

This post is written in memory of Prof. Elio Di Claudio, who did not get to see the wonders of algebraic and matrix techniques in LLMs and generative AI, as he passed away a few months before the fateful November 30, 2022, the date ChatGPT was launched to the general public. I am sure that he would have studied these architectures and come to know them perfectly, and that he would have been available, as always, to systematize his already extensive technical knowledge.

2025 - Studies in Computational Intelligence (SCI, volume 1196, Springer) finally published

SCI, volume 1196, Springer
 

The collection "Studies in Computational Intelligence" (SCI, volume 1196) has finally been published, containing the book-chapter extensions of our works selected at IJCCI, the International Joint Conference on Computational Intelligence (2022).

Our contributions concern energy sustainability, Smart Grids and Renewable Energy Communities, in particular modeling and control techniques and energy forecasting.

Specifically, the two studies are the following:

1) Antonino Capillo, Enrico De Santis, Fabio Massimo Frattale Mascioli, and Antonello Rizzi, On the Performance of Multi-Objective Evolutionary Algorithms for Energy Management in Microgrids

Abstract. In the context of Energy Communities (ECs), where energy flows among PV generators, batteries and loads have to be optimally managed not to waste a single drop of energy, relying on robust optimization algorithms is mandatory. The purpose of this work is to reasonably investigate the performance of the Fuzzy Inference System-Multi-Objective-Genetic Algorithm model (MO-FIS-GA), synthesized for achieving the optimal Energy Management strategy for a docked e-boat. The MO-FIS-GA performance is compared to a model composed of the same FIS implementation related to the former work but optimized by a Differential Evolution (DE) algorithm – instead of the GA – on the same optimization problem. Since the aim is not evaluating the best-performing optimization algorithm, it is not necessary to push their capabilities to the max. Rather, a good meta-parameter combination is found for the GA and the DE such that their performance is acceptable according to the technical literature. Results show that the MO-FIS-GA performance is similar to the equivalent MO-FIS-DE model, suggesting that the former could be worth developing. Further works will focus on proposing the aforementioned comparison on different optimization problems for a wider performance evaluation, aiming at implementing the MO-FIS-GA on a wide range of real applications, not only in the nautical field.


2) Sabereh Taghdisi Rastkar, Danial Zendehdel, Antonino Capillo, Enrico De Santis, and Antonello Rizzi, Seasonality Effect Exploration for Energy Demand Forecasting in Smart Grids

Abstract. Effective energy forecasting is essential for the efficient and sustainable management of energy resources, especially as energy demand fluctuates significantly with seasonal changes. This paper explores the impact of seasonality on forecasting algorithms in the context of energy consumption within Smart Grids. Using three years of data from four different countries, the study evaluates and compares both seasonal models – such as Seasonal Autoregressive Integrated Moving Average (SARIMA), Seasonal Long Short-Term Memory (Seasonal-LSTM), and Seasonal eXtreme Gradient Boosting (Seasonal-XGBoost) – and their non-seasonal counterparts. The results demonstrate that seasonal models outperform non-seasonal ones in capturing complex consumption patterns, offering improved accuracy in energy demand prediction. These findings provide valuable insights for energy companies or in the design of intelligent Energy Management Systems, suggesting optimized strategies for resource allocation and underscoring the importance of advanced forecasting methods in supporting sustainable energy practices in urban environments.


BibTex book citation:

@book{back2025computational,
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  title     = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  year      = {2025},
  publisher = {Springer},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  doi       = {10.1007/978-3-031-85252-7}
}
 

Single chapter BibTex references:

@incollection{chapter1IJCCI2025,
  author    = {Antonino Capillo and Enrico De Santis and Fabio Massimo Frattale Mascioli and Antonello Rizzi},
  title     = {On the Performance of Multi-Objective Evolutionary Algorithms for Energy Management in Microgrids},
  booktitle = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  publisher = {Springer},
  year      = {2025},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  chapter   = {1},
  pages     = {1--10},
  doi       = {10.1007/978-3-031-85252-7_1}
}
 

@incollection{rastkar2025seasonality,
  author    = {Sabereh Taghdisi Rastkar and Danial Zendehdel and Antonino Capillo and Enrico De Santis and Antonello Rizzi},
  title     = {Seasonality Effect Exploration for Energy Demand Forecasting in Smart Grids},
  booktitle = {Computational Intelligence: 14th and 15th International Joint Conference on Computational Intelligence (IJCCI 2022 and IJCCI 2023) Revised Selected Papers},
  editor    = {Thomas B{\"a}ck and Niki van Stein and Christian Wagner and Jonathan M. Garibaldi and Francesco Marcelloni and H. K. Lam and Marie Cottrell and Faiyaz Doctor and Joaquim Filipe and Kevin Warwick and Janusz Kacprzyk},
  publisher = {Springer},
  year      = {2025},
  series    = {Studies in Computational Intelligence},
  volume    = {1196},
  pages     = {211--223},
  doi       = {10.1007/978-3-031-85252-7_12},
  url       = {https://link.springer.com/chapter/10.1007/978-3-031-85252-7_12}
}
 



Thursday, January 16, 2025

The Future of Lithium-Ion Battery Diagnostics: Insights from Degradation Mechanisms and Differential Curve Modeling

 

Featured Research paper: Degradation mechanisms and differential curve modeling for non-invasive diagnostics of lithium cells: An overview

De Santis, E., Pennazzi, V., Luzi, M., & Rizzi, A., Renewable and Sustainable Energy Reviews, Volume 211, April 2025

 

As the world pivots towards sustainable energy solutions, lithium-ion batteries (LIBs) have emerged as indispensable components in electric vehicles (EVs) and renewable energy systems. Their efficiency and longevity, however, are hindered by the phenomenon of battery aging — a multifaceted issue tied to the gradual decline in performance and safety. The recent paper, grounded in a project developed with Ferrari S.p.A., "Degradation mechanisms and differential curve modeling for non-invasive diagnostics of lithium cells: An overview" — published in the prestigious journal Renewable and Sustainable Energy Reviews — offers a detailed exploration of the degradation processes in LIBs, introducing innovative diagnostic methodologies and shedding light on future directions for research and industry.

Our research group at CIPARLABS is strongly committed to developing technologies for energy sustainability. Lithium-ion battery modeling is among the topics under study and development in our laboratory at the "Sapienza" University of Rome, Department of Information Engineering, Electronics and Telecommunications (DIET).


The Challenge of Battery Aging

Lithium-ion batteries, the backbone of EVs, offer numerous advantages such as high energy density, lightweight construction, and zero emissions. However, they face significant challenges, particularly the progressive degradation of their components. Battery aging manifests as a decline in capacity, efficiency, and safety, influenced by factors such as temperature extremes, charging rates, and the depth of discharge (DOD). Addressing these issues is critical to optimizing battery performance and aligning with broader environmental goals like the UN Sustainable Development Goals (SDGs).

Battery degradation occurs in two primary forms:

  • Calendar Aging: Degradation during storage, even in the absence of active use, exacerbated by conditions like high temperature and elevated state of charge (SOC).

  • Cycle Aging: Degradation resulting from repetitive charging and discharging cycles.

These processes lead to two key degradation modes:

  • Loss of Lithium Inventory (LLI): A reduction in the cyclable lithium ions due to side reactions.

  • Loss of Active Materials (LAM): Structural damage or dissolution of electrode materials, impacting the battery’s ability to store and deliver energy effectively.

A Diagnostic Revolution: Differential Curve Modeling

The cornerstone of the paper is its focus on differential curve modeling—a non-invasive and powerful tool for diagnosing battery aging. Differential curves, specifically Incremental Capacity (IC) and Differential Voltage (DV) curves, are derived from charge/discharge data. These curves amplify subtle changes in battery behavior, revealing critical insights into degradation mechanisms.

  • Incremental Capacity (IC) Curves: By plotting the change in charge against voltage, IC curves expose phase transitions in electrode materials, which are sensitive to degradation modes.

  • Differential Voltage (DV) Curves: These represent voltage changes relative to charge, offering detailed insights into electrode-specific reactions and transitions.

These curves act as diagnostic fingerprints, capturing the nuanced dynamics of battery aging. For instance, shifts in IC curve peaks or DV curve valleys can be linked to specific degradation processes, enabling precise assessments of battery health.
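To make the idea concrete, here is a minimal numerical sketch using synthetic, purely illustrative charge/voltage data (not taken from the paper), showing how IC and DV curves can be obtained by differentiating the charge curve:

import numpy as np

# synthetic constant-current charge data (illustrative only, not a real cell)
Q = np.linspace(0.01, 2.5, 500)                    # charge throughput [Ah]
V = 3.0 + 0.3 * Q + 0.1 * np.tanh(8 * (Q - 1.2))   # terminal voltage [V]

dQdV = np.gradient(Q, V)  # Incremental Capacity (IC) curve: peaks mark phase transitions
dVdQ = np.gradient(V, Q)  # Differential Voltage (DV) curve: valleys track electrode features

print("IC peak located at %.3f V" % V[np.argmax(dQdV)])

Tracking how such peaks and valleys shift or shrink over the battery's life is precisely what links these curves to the LLI and LAM degradation modes discussed above.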

Bridging Science and Application

The paper highlights the practical potential of differential curve analysis. In the automotive sector, this technique can be integrated into Battery Management Systems (BMS) for real-time monitoring and predictive maintenance. By identifying early signs of aging, manufacturers can optimize charging protocols, enhance safety, and extend battery lifespan. This not only reduces costs but also aligns with sustainability objectives by minimizing waste.

In the energy sector, differential curves can support the management of large-scale energy storage systems, ensuring reliability and efficiency. Policymakers, too, can leverage these insights to refine regulations and standards for EVs, accelerating the transition to sustainable transportation.

Future Directions and Innovations

While differential curve modeling offers substantial promise, challenges remain. Noise sensitivity during data processing and variability in experimental conditions necessitate standardized protocols for broader applicability. The integration of machine learning represents an exciting frontier. By training algorithms on IC/DV curve data, researchers can automate diagnostics, identify anomalous patterns, and predict battery failures with unprecedented accuracy.

The non-destructive nature of this approach makes it particularly appealing. Unlike invasive post-mortem analyses, differential curve modeling preserves battery integrity, offering a cost-effective and scalable solution for both academic and industrial applications.

Conclusion

The insights presented in the paper underscore the transformative potential of advanced diagnostic techniques for lithium-ion batteries. By unraveling the complexities of degradation mechanisms and leveraging differential curve modeling, researchers and industry leaders can pave the way for safer, more efficient, and sustainable energy storage solutions. As the global push for electrification and decarbonization accelerates, such innovations are not just timely but essential.

The road ahead is one of collaboration and innovation, bridging gaps between scientific research, industrial practices, and policy frameworks. With tools like differential curve modeling, we are better equipped to meet the challenges of the energy transition and drive a future powered by clean and reliable energy.

 

Please cite as:

  • APA format:

De Santis, E., Pennazzi, V., Luzi, M., & Rizzi, A. (2025). Degradation mechanisms and differential curve modeling for non-invasive diagnostics of lithium cells: An overview. Renewable and Sustainable Energy Reviews, 211, 115349. https://doi.org/10.1016/j.rser.2025.115349

  • BibTex format:

@article{ENRICO2025115349,
  author   = {De Santis, Enrico and Pennazzi, Vanessa and Luzi, Massimiliano and Rizzi, Antonello},
  title    = {Degradation mechanisms and differential curve modeling for non-invasive diagnostics of lithium cells: An overview},
  journal  = {Renewable and Sustainable Energy Reviews},
  volume   = {211},
  pages    = {115349},
  year     = {2025},
  issn     = {1364-0321},
  doi      = {10.1016/j.rser.2025.115349},
  url      = {https://www.sciencedirect.com/science/article/pii/S136403212500022X},
  keywords = {Ageing, Diagnosis, Degradation mechanisms, Degradation modes, Differential curves, Differential voltage, Lithium-ion batteries, Incremental capacity, State of health}
}

 

 

 

 

Monday, January 6, 2025

Large Concept Models: A Paradigm Shift in Language and Information Representation

Left: visualization of reasoning in an embedding space of concepts (task of summarization). Right: fundamental architecture of a Large Concept Model (LCM). ⋆: concept encoder and decoder are frozen.


Large Concept Models (LCMs) – presented by Facebook research [1] – represent a transformative leap in the field of natural language processing, aiming to address limitations in traditional Large Language Models (LLMs) by operating on a higher semantic plane. While LLMs focus on token-level processing, predicting the next word or subword in a sequence, LCMs abstract their operations to the level of "concepts." These concepts are language- and modality-agnostic, encapsulating the semantic essence of sentences or higher-order units of meaning, which positions LCMs as closer analogs to human reasoning.

Traditional LLMs have demonstrated unparalleled success across a range of tasks, from text generation to multimodal applications like image captioning or code synthesis. However, their reliance on token-level prediction imposes a fundamental limitation. Tokens capture syntactic, low-level structure but do not inherently represent the semantic relationships or higher-order dependencies humans naturally process. LCMs, by contrast, explicitly model these relationships, leveraging conceptual units that enable reasoning and planning over long sequences or across complex ideas.

In this context, LCMs align well with the broader theoretical framework of language and information representation. At the heart of this shift lies the notion of concepts as high-level semantic units. These units, often corresponding to sentences, abstract away from the specifics of language and modality, providing a universal representation of meaning. This abstraction is vital for tasks requiring multilingual and multimodal capabilities, as it decouples the reasoning process from the constraints of individual languages or data types. For example, by operating in a sentence embedding space like SONAR, which supports multiple languages and modalities, LCMs can seamlessly perform zero-shot generalization, extending their reasoning capabilities to unseen languages or domains without additional fine-tuning. This reimagining of neural networks emphasizes the hierarchical and structured nature of human cognition, where meaning is not a property of individual words but emerges from their relationships within larger contexts.

This conceptual foundation brings several advantages. First, LCMs inherently handle longer contexts more efficiently. Unlike token-based transformers, whose computational complexity scales quadratically with sequence length, LCMs process much shorter sequences of embeddings. This not only reduces resource demands but also enables more effective modeling of dependencies over extended sequences. Additionally, by decoupling reasoning from language or modality, LCMs achieve unmatched flexibility, allowing for local interactive edits, seamless multilingual integration, and the potential to extend into novel modalities like sign language. Furthermore, the explicit hierarchical structure of LCMs facilitates coherent long-form generation, making them particularly suited for tasks like summarization, story generation, or cross-lingual document alignment.

However, LCMs are not without limitations. Their reliance on pre-trained embedding spaces like SONAR introduces a dependency that may constrain performance when embeddings do not perfectly align with the model’s reasoning architecture. While the abstraction to sentence-level concepts is a significant step forward, further extending this abstraction to paragraphs or sections remains an area of ongoing research. Additionally, LCMs currently lag behind LLMs in tasks requiring fine-tuned instruction-following, such as creative writing or detailed conversational exchanges. This gap highlights the need for better integration of low-level token dynamics with high-level conceptual reasoning.

The architecture of LCMs is pivotal in achieving their ambitious goals. The process begins with an encoder, typically leveraging a pre-trained system like SONAR, which maps sentences into a fixed-dimensional semantic embedding space. These embeddings form the foundation for the LCM core – a transformer-based model that predicts the next embedding in an autoregressive manner. The core transforms and processes embeddings through layers of attention and feed-forward mechanisms, using additional components like PreNet and PostNet to normalize and denormalize embeddings to match the transformer’s internal dimensions. Variants of the core architecture explore different strategies for embedding prediction. For example, diffusion-based models introduce noise to embeddings during training and iteratively denoise them during inference, while quantized models discretize the embedding space into learned centroids for efficient prediction. Finally, the decoder converts the predicted embeddings back into text or other modalities, ensuring semantic consistency with the original input.
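To fix ideas, here is a deliberately tiny sketch of the core mechanism (this is not Meta's implementation: the frozen SONAR encoder/decoder are omitted, random embeddings stand in for real sentence embeddings, and all layer sizes and names are illustrative assumptions). It shows a transformer that autoregressively predicts the next sentence embedding, with PreNet/PostNet-style projections into and out of the model dimension:

import torch
import torch.nn as nn

class TinyLCMCore(nn.Module):
    def __init__(self, concept_dim=256, model_dim=512, n_layers=2, n_heads=8):
        super().__init__()
        self.prenet = nn.Linear(concept_dim, model_dim)   # project concepts to the model dimension
        layer = nn.TransformerEncoderLayer(model_dim, n_heads, batch_first=True)
        self.core = nn.TransformerEncoder(layer, n_layers)
        self.postnet = nn.Linear(model_dim, concept_dim)  # project back to the concept space

    def forward(self, concepts):
        # concepts: (batch, seq_len, concept_dim) sentence embeddings, e.g. from a frozen encoder
        h = self.prenet(concepts)
        causal = nn.Transformer.generate_square_subsequent_mask(concepts.size(1))
        h = self.core(h, mask=causal)  # each position attends only to previous concepts
        return self.postnet(h)         # predicted next-concept embeddings

# usage: predict the embedding of sentence 6 from the embeddings of sentences 1-5
embeddings = torch.randn(1, 5, 256)
next_concept = TinyLCMCore()(embeddings)[:, -1]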

The ability of LCMs to generate coherent text hinges on this seamless integration of encoding, reasoning, and decoding. By operating in a structured embedding space, the model can capture the relationships between concepts and generate meaningful continuations or transformations of input sequences. The flexibility of the decoder allows for outputs in different languages or modalities, depending on the task requirements, without necessitating retraining or additional data.

Looking forward, the potential for LCMs is immense. Future improvements could involve developing embedding spaces specifically optimized for conceptual reasoning, moving beyond generic solutions like SONAR. Extending the level of abstraction to encompass paragraphs or sections could unlock new capabilities in long-form text generation and document analysis. Hybrid architectures that combine the strengths of token-level models with concept-level reasoning might also bridge the current gap in instruction-following tasks. Moreover, incorporating additional modalities, such as visual or gestural data, could further expand the applicability of LCMs to areas like education, accessibility, and human-computer interaction.

The conceptual shift represented by LCMs has profound implications for the future of AI. By modeling reasoning and meaning at a semantic level, these systems provide a path toward more human-like understanding and interaction. As research progresses, LCMs have the potential to redefine how machines represent and process information, bringing us closer to the goal of truly intelligent systems.

Large Concept Models, therefore, are another piece of the puzzle on the road toward more intelligent and better-performing systems that can approach so-called AGI (Artificial General Intelligence). Below is a detailed list of open research topics in the field of LLMs:

  • Multimodal Integration: Training models that process diverse modalities (text, images, audio, video) to develop broader contextual understanding and reasoning abilities.
  • Hierarchical Reasoning: Incorporating explicit reasoning structures, like Large Concept Models (LCMs), to operate at multiple abstraction levels for coherent, long-term planning.
  • Retrieval-Augmented Generation (RAG): Combining LLMs with external knowledge bases to enhance factual accuracy and reduce hallucinations.
  • Memory-Augmented Models: Developing persistent memory architectures for LLMs to store and recall knowledge dynamically across interactions.
  • Embodiment and Simulation: Training AI systems in physical or simulated environments to foster embodied cognition and interactive learning.
  • Self-Supervised Learning (SSL): Leveraging vast unlabeled data to improve representations without explicit supervision, advancing generalization and robustness.
  • Continual Learning: Developing mechanisms for incremental knowledge acquisition without forgetting past information (mitigating catastrophic forgetting).
  • Prompt Engineering and Fine-Tuning: Refining prompts or using adaptive fine-tuning strategies to align LLM outputs with specific tasks or ethical standards.
  • Modular Architectures: Splitting tasks among specialized sub-models that collaborate for broader, more efficient problem-solving.
  • Meta-Learning: Enabling models to learn how to learn, generalizing quickly to new tasks with minimal training.
  • Neuro-Symbolic Approaches: Combining neural networks with symbolic reasoning for interpretable and logic-driven decision-making.
  • Causal Reasoning: Integrating causality into LLMs to enable better reasoning and decision-making beyond statistical correlations.
  • Energy-Efficient Training: Investigating low-resource training methodologies, such as sparse transformers and quantized architectures, for scalability.
  • Evolutionary Algorithms: Applying optimization inspired by biological evolution to explore model architectures and strategies.
  • Large Context Windows: Extending LLMs' capacity to handle longer contexts for coherent long-form reasoning and memory.
  • Alignment and Alignment-Based Architectures: Ensuring LLMs align with human values and goals, incorporating reinforcement learning from human feedback (RLHF).
  • Distributed Multi-Agent Systems: Creating networks of interacting agents to simulate collaborative intelligence and emergent behavior.
  • Transformer Alternatives: Exploring architectures beyond transformers (e.g., RNN variants, Perceiver, or liquid neural networks) for flexibility and efficiency.
  • Sparse Models: Utilizing sparsity in parameter usage to scale model sizes without corresponding resource costs.
  • Open-Ended Exploration: Developing AI systems capable of self-guided exploration and intrinsic motivation to learn autonomously.
  • Simulated Interiority: Training LLMs to create intermediate symbolic representations that mimic introspective thought.
  • Ethical and Societal Alignment: Embedding ethical reasoning, bias mitigation, and societal impact analysis into AI development.


To learn more about Large Concept Models:

Download the Paper

GitHub repository 

________________________

[1] The LCM team, Large Concept Models: Language Modeling in a Sentence Representation Space, arXiv, 2024



Tuesday, December 31, 2024

Fractal Happy New Year 2025 from CIPARLABS!

 

 


As we step into 2025, CIPARLABS reflects on a year of exceptional multidisciplinary research spanning artificial intelligence and neural networks, energy systems, healthcare, and complex systems theory. This year we wish you a happy "fractal" 2025, to underline our scientific approach to the problems we set out to solve, an approach rooted in the science of complexity.

The synergy between AI and complexity science has unlocked innovative solutions to societal challenges. From revolutionizing energy grids to enhancing medical diagnostics, our work exemplifies how common frameworks can empower diverse fields. This post celebrates our achievements, highlighting the unity of disciplines and the endless possibilities of a collaborative future.


2024 Highlights: Research Achievements

1. Transformative Advances in Energy Management

  • Battery Modeling for Renewable Energy Communities: A Thevenin-based equivalent circuit model optimized energy management strategies, balancing computational efficiency and accuracy in predicting battery performance.
  • Energy Load Forecasting Breakthrough: Novel integration of second-derivative features into machine learning models like LSTM and XGBoost significantly improved predictions for peak energy demands, enhancing microgrid stability.
  • Smart Grid Fault Detection: The Bilinear Logistic Regression Model enabled interpretable AI-driven fault detection, ensuring resilient energy infrastructures.

2. Innovations in Healthcare through Explainable AI

  • Melanoma Diagnosis: Developed a custom CNN with feature injection, utilizing Grad-CAM, LRP, and SHAP methodologies to interpret deep learning predictions. This workflow sets a benchmark for explainability in computer-aided diagnostics.
  • Text Classification in Healthcare Discussions: Conducted a comparative study of traditional and transformer-based models (BERT, GPT-4) to classify Italian-language healthcare-related social media discussions, combating misinformation effectively.

3. Exploring Human vs. Machine Intelligence

  • Using complex systems theory and Large Language Models, we analyzed GPT-2’s language generation dynamics versus human-authored content. The study revealed distinct statistical properties, such as recurrence and multifractality, informing applications like fake news detection and authorship verification.

Future Directions: Looking Ahead to 2025

CIPARLABS aims to deepen its focus on explainable AI for critical applications in energy, healthcare, and language modeling. We are committed to expanding our interdisciplinary efforts, incorporating insights from philosophy, complex systems, and AI ethics. Future work will include:

  • Integrating advanced multimodal AI systems in healthcare.
  • Scaling energy solutions to diverse legislative frameworks worldwide.
  • Further bridging AI and human cognition to enhance ethical and transparent AI systems.

List of Published Papers (2024)

An Online Hierarchical Energy Management System for Renewable Energy Communities
Submitted to: IEEE Transactions on Sustainable Energy

Improving Prediction Performances by Integrating Second Derivative in Microgrids Energy Load Forecasting
Published in: IEEE IJCNN 2024, IEEE

From Bag-of-Words to Transformers: A Comparative Study for Text Classification in Healthcare Discussions in Social Media
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence

An Extended Battery Equivalent Circuit Model for an Energy Community Real-Time EMS
Published in: IEEE IJCNN 2024

Modeling Failures in Smart Grids by a Bilinear Logistic Regression Approach
Published in: Neural Networks, Elsevier

An XAI Approach to Melanoma Diagnosis: Explaining the Output of Convolutional Neural Networks with Feature Injection
Published in: Information, MDPI

Human Versus Machine Intelligence: Assessing Natural Language Generation Models Through Complex Systems Theory
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE

Many others are in progress!

Wednesday, December 18, 2024

New Regulations and Trends for Renewables and Storage Systems in Italy

 


Italy's energy transition is going through a pivotal phase, with new regulations and trends reshaping the landscape of renewables and storage systems. Two recent developments deserve particular attention: the approval of the Testo Unico sulle Rinnovabili (the consolidated act on renewables) and the updated figures on the energy storage market. Both offer an overview of the challenges and opportunities for the national energy sector.

Testo Unico sulle Rinnovabili: simplification and new rules

Approved by the Council of Ministers, the Testo Unico sulle Rinnovabili (or Testo Unico FER) will enter into force on December 30, 2024. Its main goal is to simplify the complex bureaucratic procedures for building and operating renewable energy plants, through three administrative regimes:

  1. Free activity (Attività libera):

    • Exemption from permits and authorizations for works that do not interfere with protected assets or public works.

    • Applicable to photovoltaic plants up to 12 MW (building-integrated) or 1 MW (ground-mounted), single wind turbines, agrivoltaic plants up to 5 MW, and other specific configurations.

    • A security deposit is required for works on undeveloped land.

  2. Simplified authorization procedure (Procedura abilitativa semplificata, PAS):

    • Requires a declaration of availability of the surfaces involved, minimization of the landscape impact, and a surety bond covering restoration costs.

    • Provides for territorial charges and compensation measures for plants with capacity above 1 MW.

    • The permit lapses if works are not started or completed within the established deadlines.

  3. Single authorization (Autorizzazione unica, AU):

    • Regional competence for plants up to 300 MW; ministerial competence for offshore plants or those above 300 MW.

    • Includes restoration obligations and a minimum validity of 4 years.

Acceleration zones: by May 2025 the GSE will publish a map of the areas available for renewable energy plants. By February 2026, Regions and Autonomous Provinces will adopt plans to further streamline the authorization procedures.

Storage systems: slowdown and opportunities

The Italian energy storage market is experiencing contrasting dynamics. After the boom driven by the Superbonus, the residential segment has slowed down sharply, while the utility sector has shown significant growth.

Key figures for 2024

  • Residential segment:

    • 25% drop in installations, -31% in power and -29% in capacity compared to 2023.

  • Commercial and industrial (C&I) sector:

    • 18% reduction in installations, -29% in power and -11% in capacity compared to the previous year.

  • Utility scale:

    • Exponential growth, with +133% in installations, +532% in power and +2877% in capacity, driven by capacity-market projects and non-incentivized merchant initiatives.

Regulatory issues

  • The end of the Superbonus and the changes to tax deductions have weighed negatively on the residential segment.

  • The Testo Unico Rinnovabili leaves uncertainties about the authorization procedures for storage systems, with possible conflicts of competence between administrations.

  • Anie Rinnovabili proposes that the new rules apply only to future projects and calls for regulatory harmonization within six months.

Cumulative figures as of September 2024

  • Installed storage systems: 692,386 units.

  • Total power: 5,034 MW.

  • Maximum capacity: 11,388 MWh.

In conclusion, renewables and storage systems are at the heart of Italy's energy transition. While the Testo Unico sulle Rinnovabili promises to simplify procedures, the storage market reflects the challenges tied to regulation and to the end of key incentives. Nevertheless, the utility-scale figures demonstrate the sector's growth potential, pointing to opportunities for the future.

 

Source 1

Source 2

Thursday, December 5, 2024

An XAI Approach to Melanoma Diagnosis: Explaining the Output of Convolutional Neural Networks with Feature Injection

 

 https://www.mdpi.com/2078-2489/15/12/783

 

Explainable artificial intelligence (XAI) is becoming a cornerstone of modern AI applications, especially in sensitive fields like healthcare, where the need for transparency and reliability is paramount. Our latest research focuses on enhancing the interpretability of convolutional neural networks (CNNs) used for melanoma diagnosis, a field where accurate and trustworthy tools can significantly impact clinical practice.

Melanoma is one of the most aggressive forms of skin cancer, posing challenges in diagnosis due to its visual similarity to benign lesions. While deep learning models have demonstrated remarkable diagnostic accuracy, their adoption in clinical workflows has been hindered by their "black box" nature. Physicians need to understand why a model makes specific predictions, not only to trust the results but also to integrate these tools into their decision-making processes. In this context, our research introduces a novel workflow that combines state-of-the-art XAI techniques to provide both qualitative and quantitative insights into the decision-making process of CNNs. The uniqueness of our approach lies in the integration of additional handcrafted features, specifically Local Binary Pattern (LBP) texture features, into the CNN architecture. These features, combined with the automatically extracted data from the neural network, allow us to analyze and interpret the network's predictions more effectively.

The study leverages public datasets of dermoscopic images from the ISIC archive, carefully balancing training and validation datasets to ensure robust results. The modified CNN architecture features five convolutional layers followed by dense layers to reduce dimensionality, making the network’s internal processes more interpretable. Alongside dermoscopic images, the network is fed LBP features, which are injected into the flattened layer to augment the learning process.
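For readers who want to picture the feature-injection step, here is a minimal sketch (not the paper's exact architecture: the layer sizes, image resolution and number of LBP features are illustrative assumptions) in which handcrafted features are concatenated with the flattened CNN features before the dense classification layers:

import torch
import torch.nn as nn

class CNNWithFeatureInjection(nn.Module):
    def __init__(self, n_lbp_features=26, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4),
        )
        # flattened CNN features are concatenated with the injected handcrafted features
        self.classifier = nn.Sequential(
            nn.Linear(32 * 4 * 4 + n_lbp_features, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, image, lbp_features):
        learned = torch.flatten(self.conv(image), 1)       # features learned by the network
        fused = torch.cat([learned, lbp_features], dim=1)  # feature injection at the flattened layer
        return self.classifier(fused)

# usage with dummy tensors: a batch of 4 dermoscopic images and their LBP histograms
logits = CNNWithFeatureInjection()(torch.randn(4, 3, 128, 128), torch.randn(4, 26))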

To explain the model's predictions, we employed two key XAI techniques: Grad-CAM and Layer-wise Relevance Propagation (LRP). Grad-CAM generates activation maps that highlight regions of the image influencing the network’s decisions, while LRP goes further by assigning relevance scores to individual pixels. Together, these methods provide a visual explanation of the decision-making process, helping to identify which areas of an image the model considers most important for classification. Interestingly, we observed that LRP was particularly effective in distinguishing clinically relevant patterns, while Grad-CAM occasionally identified spurious correlations. For a quantitative perspective, we used the kernel SHAP method, grounded in game theory, to assess the importance of features in the network’s predictions. This analysis revealed that most of the classification power - approximately 76.6% - came from features learned by the network, while the remaining 23.4% was contributed by the handcrafted LBP features. Such insights not only validate the role of feature injection but also open avenues for integrating diagnostically meaningful features, such as lesion asymmetry or border irregularities, into future models.

The performance of our modified CNN surpassed both our earlier work and other state-of-the-art approaches, achieving an accuracy of 98.41% and an AUC of 98.00% on the external test set. These results underscore the effectiveness of our interpretability framework, proving that improving transparency does not necessarily compromise accuracy; on the contrary, accuracy can even be enhanced.

While this research marks significant progress, it also highlights areas for future exploration. The use of handcrafted features with limited diagnostic value, such as LBP, points to the need for incorporating features more aligned with clinical evaluation, like the ABCDE rule used for melanoma assessment. Moreover, involving dermatologists in the evaluation process could provide valuable qualitative feedback to refine the interpretability methods further.

This work demonstrates that XAI is not only a tool for explaining AI decisions but also a critical component for building trust in AI systems, especially in high-stakes fields like medical diagnostics. By combining visual and quantitative explanations, we hope to bridge the gap between AI and clinical practice, paving the way for broader adoption of AI-assisted tools in healthcare. Through this transparent and interpretable approach, we aim to empower clinicians, enhance diagnostic accuracy, and ultimately improve patient outcomes.

Here is the paper: https://www.mdpi.com/2078-2489/15/12/783






