LISA

Lithium-Ion Solid-State Assistant


Description

LISA — Lithium-Ion Solid-State Assistant

A retrieval-augmented research assistant for knowledge defragmentation in battery science


What problem does LISA solve?

Battery research — and solid-state battery research in particular — is inherently cross-disciplinary. Electrochemists, materials scientists, engineers and data scientists produce a fragmented body of knowledge scattered across publications, technical reports, dissertations and project deliverables, often using inconsistent terminology. Finding and synthesising relevant information across this literature is a significant bottleneck for research progress.

LISA (Lithium-Ion Solid-State Assistant) addresses this directly. It is a domain-specific virtual research assistant built on Retrieval-Augmented Generation (RAG) — a technique that combines the broad language capabilities of large language models (LLMs) with targeted, evidence-based retrieval from a curated document corpus. Rather than relying on a model's parametric memory, LISA grounds every response in retrieved source passages, making its answers traceable and factually anchored.

The system was developed and validated using the document corpus of FestBatt, Germany's national competence cluster for solid-state battery research, as a case study.


How it works

The RAG principle applied to battery science
The RAG principle: a user query triggers retrieval from a document corpus, which is then used to augment the prompt before the LLM generates a grounded response.

At its core, LISA implements a standard RAG pipeline (a minimal code sketch follows the figure below):

  1. Retrieval — the user's query is matched against a pre-indexed document corpus stored as dense vector embeddings in a vector database.
  2. Augmentation — the most relevant document chunks are assembled into a structured prompt.
  3. Generation — an LLM produces a response grounded in the retrieved evidence.

LISA's basic RAG pipeline
Detailed view of the pipeline: documents are chunked and indexed in a vector database; at query time, the most relevant chunks are retrieved and assembled into a prompt for the LLM.
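
To make the three steps concrete, the minimal Python sketch below retrieves from a toy corpus, assembles a grounded prompt, and leaves the generation step to whichever LLM backend is configured. The embedding model, corpus and prompt template are illustrative assumptions, not LISA's actual configuration.

import numpy as np
from sentence_transformers import SentenceTransformer

# Toy corpus standing in for the indexed document chunks.
corpus = [
    "Sulfide solid electrolytes offer high ionic conductivity but are moisture sensitive.",
    "Polymer electrolytes are easy to process but conduct poorly at room temperature.",
    "Interface resistance between cathode and solid electrolyte limits rate capability.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")            # assumed embedding model
doc_vecs = embedder.encode(corpus, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Step 1 (Retrieval): rank pre-indexed chunks by cosine similarity.
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def augment(query: str, chunks: list[str]) -> str:
    # Step 2 (Augmentation): assemble retrieved chunks into a grounded prompt.
    context = "\n".join(f"- {c}" for c in chunks)
    return ("Answer the question using only the sources below.\n"
            f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:")

query = "Why is interface resistance a problem in solid-state batteries?"
prompt = augment(query, retrieve(query))
# Step 3 (Generation): pass `prompt` to the configured LLM backend, which
# produces a response grounded in the retrieved passages.
print(prompt)

In LISA, the same three steps run against the indexed FestBatt corpus and a configurable, API-based LLM backend.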

What distinguishes LISA from a generic RAG implementation is its retrieval architecture. The system combines three complementary strategies to maximise retrieval quality on technical scientific text (a combined sketch follows the figure below):

  • Hybrid Search — parallel dense (semantic) and sparse (keyword-based) retrieval, capturing both conceptual similarity and exact terminology
  • Small-to-big Retrieval — documents are indexed at fine granularity (small chunks) but the retrieved context is expanded to larger surrounding passages before augmentation, preserving coherence
  • Reranking — retrieved candidates are reordered by a cross-encoder model before being passed to the LLM, improving precision

LISA's advanced retrieval architecture
The full Document Search module: hybrid dense/sparse retrieval feeds into a shared candidate pool, which is reranked before prompt synthesis.
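
The sketch below imitates this module on a toy corpus: BM25 and dense embeddings feed a shared candidate pool, a cross-encoder reranks it, and the surviving small chunks are expanded to their parent passages. The libraries (rank_bm25, sentence-transformers), model names and the simple score fusion are assumptions for illustration; LISA's actual implementation builds on LangChain components.

import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, CrossEncoder

# Small chunks indexed for retrieval; each points to a larger parent passage
# that is substituted back in before prompt synthesis (small-to-big retrieval).
chunks = [
    {"text": "LLZO garnet electrolytes show high ionic conductivity.", "parent": 0},
    {"text": "Garnet surfaces react with moisture and CO2, forming Li2CO3.", "parent": 0},
    {"text": "Sulfide electrolytes deform under pressure, improving interfacial contact.", "parent": 1},
]
parents = [
    "Full passage on garnet-type (LLZO) electrolytes ...",
    "Full passage on sulfide electrolytes and cell assembly ...",
]

texts = [c["text"] for c in chunks]
embedder = SentenceTransformer("all-MiniLM-L6-v2")                  # assumed models
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
dense_vecs = embedder.encode(texts, normalize_embeddings=True)
bm25 = BM25Okapi([t.lower().split() for t in texts])

def hybrid_retrieve(query: str, k: int = 3) -> list[int]:
    # Hybrid search: fuse dense (semantic) and sparse (keyword) scores.
    dense = dense_vecs @ embedder.encode([query], normalize_embeddings=True)[0]
    sparse = bm25.get_scores(query.lower().split())
    fused = dense / (dense.max() + 1e-9) + sparse / (sparse.max() + 1e-9)
    return list(np.argsort(fused)[::-1][:k])

def rerank_and_expand(query: str, candidates: list[int], top: int = 2) -> list[str]:
    # Reranking: reorder the shared candidate pool with a cross-encoder,
    # then expand the best small chunks to their parent passages.
    scores = reranker.predict([(query, texts[i]) for i in candidates])
    best = [candidates[j] for j in np.argsort(scores)[::-1][:top]]
    return list(dict.fromkeys(parents[chunks[i]["parent"]] for i in best))

query = "Which solid electrolytes react with moisture?"
print(rerank_and_expand(query, hybrid_retrieve(query)))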


Technical stack

LISA is implemented in Python and built on the LangChain framework. It integrates with:

  • Kadi4Mat as the shared virtual research environment for document management
  • Gradio for the interactive web interface
  • Configurable LLM backends (API-based)
  • Configurable embedding models and vector stores

The codebase is modular: ragchain.py, retrievers.py, embeddings.py, rerank.py, llms.py and vectorestores.py can be adapted independently, making LISA a reusable template for domain-specific RAG assistants beyond battery science.
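
As an indication of how that modularity can be used, the hypothetical sketch below wires a pipeline together from a configuration object, with each registry standing in for one of the modules listed above. None of these names are LISA's actual API; the vector store and reranker would follow the same pattern.

from dataclasses import dataclass
from typing import Callable

@dataclass
class RagConfig:
    embedding_model: str = "all-MiniLM-L6-v2"   # embeddings.py concern
    retriever: str = "hybrid"                   # retrievers.py concern
    llm_backend: str = "api"                    # llms.py concern

# Hypothetical registries; in a LISA-style layout each would live in its own
# module, so a backend can be swapped without touching the rest of the chain.
EMBEDDERS: dict[str, Callable[[], object]] = {"all-MiniLM-L6-v2": lambda: object()}
RETRIEVERS: dict[str, Callable[[], object]] = {"hybrid": lambda: object()}
LLMS: dict[str, Callable[[], object]] = {"api": lambda: object()}

def build_chain(cfg: RagConfig) -> tuple[object, object, object]:
    # ragchain.py concern: assemble the pipeline from independently configured parts.
    return (EMBEDDERS[cfg.embedding_model](),
            RETRIEVERS[cfg.retriever](),
            LLMS[cfg.llm_backend]())

print(build_chain(RagConfig()))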


Scope and generalisability

Although developed for solid-state battery research, LISA's architecture is domain-agnostic. Any field with a fragmented, multi-stakeholder document corpus — materials science, clinical research, engineering standards — can benefit from the same approach. The paper explicitly addresses this transferability and discusses evaluation methodology for RAG systems in scientific contexts.


Cite

@article{zhao2025lisa,
  title     = {LISA: A Lithium-Ion Solid-State Assistant using large language models
               for knowledge defragmentation in battery science and beyond},
  author    = {Zhao, Yinghan and Hansen, Anna-Lena and Dahlhaus, Anna and
               Brandt, Nico and Selzer, Michael and Koeppe, Arnd and
               Nestler, Britta and Knapp, Michael and Ehrenberg, Helmut},
  journal   = {Materials Today Communications},
  volume    = {45},
  pages     = {112380},
  year      = {2025},
  doi       = {10.1016/j.mtcomm.2025.112380}
}

Participating organisations

opencampus
Christian-Albrechts-Universität zu Kiel
Karlsruhe Institute of Technology

Contributors

Anna-Lena Hansen (Innovation Manager, Christian-Albrechts-Universität zu Kiel)
Michael Selzer
Arnd Koeppe
Helmut Ehrenberg
Britta Nestler (Hochschule Karlsruhe Technik und Wirtschaft)