Every research assistant I’ve ever mentored eventually hits the same wall. You’re three days deep into a literature review, armed with a spreadsheet of fifty PDFs, and you realize you’re reading the same introduction for the tenth time. The papers cite each other in a web that feels dense but surprisingly shallow once you start mapping it. You find a citation from 2018 that references a foundational paper from 1997, but the 2018 paper misinterprets the methodology, and now you’re stuck deciding whether to trace the error back or follow the correct lineage forward. This is the exact moment where linear processing fails, and recursive thinking becomes a survival mechanism.
The Anatomy of Recursive Review
Most research assistants approach literature review like a breadth-first search. They collect papers, skim abstracts, and move on. It’s efficient for coverage but disastrous for depth. Recursive approaches flip this. They treat the literature review not as a collection of documents but as a graph traversal problem where nodes are papers and edges are citations, methodological overlaps, or conceptual dependencies.
When you start with a seed paper—say, a recent survey on transformer architectures—you don’t just read it. You extract its bibliography, but you also reverse-engineer its citation graph. Who cites this paper? And more importantly, who cites it in ways that extend, contradict, or refine its claims? Tools like Connected Papers or Citation Gecko automate this graph generation, but the real work happens in the recursive loop: read paper A, identify paper B that A cites as foundational, read B, discover that B’s approach was actually improved by paper C in a different domain, read C, and suddenly you’re tracing a lineage that crosses subfields.
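Framed as code, that loop is just a depth-limited graph traversal. Here is a minimal sketch in Python; get_references is a placeholder you would back with a citation API such as OpenAlex or Semantic Scholar, so treat it as an illustration rather than a working client.

# Minimal sketch: the recursive loop as a depth-limited traversal of a citation graph.
# get_references() is a placeholder, not a real API client.

def get_references(paper_id):
    """Placeholder: return IDs of the papers that paper_id cites."""
    raise NotImplementedError("back this with OpenAlex or Semantic Scholar")

def trace_lineage(seed_id, max_depth=2, visited=None):
    """Recursively collect papers within max_depth citation hops of the seed."""
    if visited is None:
        visited = set()
    if max_depth < 0 or seed_id in visited:
        return visited
    visited.add(seed_id)
    for ref_id in get_references(seed_id):
        trace_lineage(ref_id, max_depth - 1, visited)
    return visited

# Usage: trace_lineage("seed-paper-id", max_depth=2) gathers everything within
# two citation hops of the seed -- the raw material for a recursive pass.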
This isn’t just about finding more papers. It’s about understanding the evolution of ideas. A linear review might tell you that the transformer’s self-attention architecture arrived in 2017. A recursive review reveals that the original formulation scales quadratically with sequence length, which led to sparse attention variants in 2019, which in turn inspired efficient-transformer approximations in 2020, before the line of work began converging with state-space models around 2023. Each step requires revisiting earlier papers with new context. You read the 2017 paper not just for its content but to see what the later papers are implicitly responding to.
Building the Citation Graph
The first practical step is constructing a directed graph where edges represent citations. But not all citations are equal. A recursive approach distinguishes between:
- Foundational citations: Papers that established the core methodology.
- Incremental citations: Papers that tweak parameters or apply the method to new datasets.
- Contesting citations: Papers that challenge assumptions or present contradictory evidence.
Manually tagging citations this way is tedious, but it’s where the insights hide. In reinforcement learning, for example, the citation graph around Deep Q-Networks (DQN) is dense with incremental work. But the recursive researcher notices that the most influential later papers aren’t the ones that incrementally improve DQN; they’re the ones that contest its assumptions about value function approximation. That trail leads to the work on overestimation bias behind Double DQN, and from there to the distributional RL papers that replace a single expected value with a full return distribution. These aren’t just more DQN papers; they’re pivot points that redirect the subfield.
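One lightweight way to make those distinctions operational is to store them as edge attributes on the graph itself. The sketch below uses networkx with an invented three-way label; the paper identifiers are shorthand, not real database IDs.

# Sketch: a citation graph whose edges carry a relationship type.
# An edge (u, v) means paper u cites paper v. Labels and IDs are illustrative.
import networkx as nx

FOUNDATIONAL, INCREMENTAL, CONTESTING = "foundational", "incremental", "contesting"

g = nx.DiGraph()
g.add_edge("double_dqn_2016", "dqn_2015", kind=CONTESTING)    # challenges overestimation
g.add_edge("dueling_dqn_2016", "dqn_2015", kind=INCREMENTAL)  # architectural tweak
g.add_edge("dqn_2015", "atari_benchmark_2013", kind=FOUNDATIONAL)

# Pull out only the contesting citations -- often the pivot points of a subfield.
contesting = [(u, v) for u, v, d in g.edges(data=True) if d["kind"] == CONTESTING]
print(contesting)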
Automating this classification is an active area of research. Natural language processing models can now estimate whether a citation is supportive, critical, or merely background by analyzing the text around it. Citation-intent classifiers trained on datasets like SciCite do exactly this, and related efforts such as CiteWorth focus on whether a sentence warrants a citation at all. While these tools aren’t perfect, they drastically reduce the manual labor of graph annotation.
Cross-Paper Synthesis as Pattern Recognition
Synthesis is where recursive review shines. Instead of summarizing each paper in isolation, you’re looking for patterns across the graph. This is akin to solving a jigsaw puzzle where the pieces are methodologies, results, and limitations, and the picture is the current state of knowledge.
Consider a research assistant tasked with reviewing “few-shot learning” in computer vision. A linear approach might produce a list of papers, each with a one-paragraph summary. A recursive approach produces a matrix:
| Paper | Base Architecture | Meta-Learning Approach | Key Limitation | Cited By |
|---|---|---|---|---|
| MAML (2017) | 4-layer ConvNet | Gradient-based adaptation | Computationally expensive (second-order gradients) | Reptile, Meta-Baseline |
| ProtoNets (2017) | 4-layer ConvNet | Prototype-based classification | Struggles with dissimilar classes | Meta-Baseline (2020) |
| Meta-Baseline (2020) | ResNet-12 | Simple cosine similarity | No adaptation at test time | EPNet |
This matrix isn’t just a summary—it’s a map of dependencies and evolutions. You notice that Meta-Baseline deliberately strips away complex meta-learning to achieve better results, which suggests a paradigm shift from “learning to learn” to “learning good representations.” This insight only emerges when you recursively compare methodologies across papers, not when you read them sequentially.
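Keeping the matrix as structured data rather than prose makes those comparisons queryable. Here is a minimal sketch with pandas, using abbreviated versions of the rows above; the fields are whatever your own review tracks.

# Sketch: the synthesis matrix as a dataframe you can filter and group.
import pandas as pd

rows = [
    {"paper": "MAML", "year": 2017, "approach": "gradient-based adaptation",
     "limitation": "computationally expensive"},
    {"paper": "ProtoNets", "year": 2017, "approach": "prototype-based classification",
     "limitation": "struggles with dissimilar classes"},
    {"paper": "Meta-Baseline", "year": 2020, "approach": "cosine similarity on pretrained features",
     "limitation": "no adaptation at test time"},
]
matrix = pd.DataFrame(rows)

# Which limitations recur across approaches? Recurring limitations often mark open problems.
print(matrix.groupby("limitation")["paper"].apply(list))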
Tools like Litmaps excel at this visual synthesis. They allow you to see clusters of papers around specific methodologies, making it easy to spot outliers or emerging trends. But the real work is interpretive. You have to ask: Why did this cluster form? What does it mean that two papers from different years share nearly identical architectures but produce divergent results? The answers often lie in subtle details—dataset splits, hyperparameter choices, or even hardware constraints—that recursive review uncovers.
Citation Tracking and the “Reference Lag”
One of the most frustrating aspects of academic publishing is the reference lag. A paper published today might cite work from five years ago, but the true impact of that cited work won’t be visible until newer papers start citing it. Recursive review accounts for this by treating citation tracking as a time-series problem.
Imagine you’re tracking the impact of a seminal paper like “Attention Is All You Need” (2017). A linear review would stop at the papers that directly cite it. A recursive review follows the citation chain forward and backward in time:
- Backward: What papers does the 2017 paper cite? The original transformer paper cites earlier work on RNNs and LSTMs, but it also cites the 2015 work on attention in encoder-decoder architectures for machine translation. Those papers are where the attention idea the transformer generalizes actually comes from, and reading them explains design choices the 2017 paper takes for granted.
- Forward: Which papers cite the transformer paper? More importantly, which papers cite it in ways that reveal a trend? The sparse transformers of 2019 and the linear-attention variants that followed in 2020 both cite the original paper, but their citations are framed as solutions to its quadratic computational cost.
This temporal recursion is critical because it reveals the half-life of ideas. Some papers have a short half-life—they’re cited heavily for a year and then fade. Others, like the transformer paper, have a long half-life, continuously influencing new subfields. By tracking citation velocity (how quickly a paper is cited after publication) and citation decay (how quickly citations drop off), you can predict which ideas will endure.
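Neither metric requires anything fancy: once you have the publication years of the citing papers, both reduce to counting. A rough sketch, with made-up numbers standing in for real citation data:

# Sketch: citation velocity and decay from a list of citing papers' years.
# The numbers below are invented; in practice they come from a citation database.
from collections import Counter

publication_year = 2017
citing_years = [2018, 2018, 2019, 2019, 2019, 2020, 2021, 2021, 2022, 2023]

per_year = Counter(citing_years)
velocity = per_year.get(publication_year + 1, 0)  # citations in the first year after publication

peak_year = max(per_year, key=per_year.get)
latest_year = max(per_year)
decay = per_year[latest_year] / per_year[peak_year]  # ~1.0 means sustained interest, near 0 means it faded

print(f"velocity={velocity}, peak={peak_year}, decay={decay:.2f}")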
Tools like Semantic Scholar and Google Scholar provide citation metrics, but they’re often lagging indicators. Recursive review uses these metrics as inputs to a larger model of intellectual evolution. For instance, if a paper has a high citation velocity but low methodological diversity (i.e., it’s cited only within its own subfield), it might be a bubble. Conversely, a paper with moderate citation velocity but high cross-disciplinary citations (e.g., cited in both computer vision and neuroscience) is likely foundational.
Practical Tools for Recursive Review
While the conceptual framework is powerful, recursive review is only feasible with the right tools. Here’s a breakdown of what works today, from open-source libraries to commercial platforms.
Graph-Based Literature Discovery
Connected Papers is the most user-friendly tool for visualizing citation graphs. You input a seed paper, and it generates a graph of related papers, colored by publication year and sized by citation count. The graph is interactive—you can click on nodes to see abstracts and export citations. However, it’s limited to papers indexed by Semantic Scholar, which covers most of computer science and physics but misses some humanities and social science fields.
Citation Gecko takes a different approach. Instead of starting from a seed paper, you upload a list of papers (e.g., from a CSV), and it finds papers that cite at least two of them. This is incredibly useful for identifying “bridge papers” that connect disparate clusters. For example, if you have papers on NLP and papers on computational biology, Citation Gecko might find a paper that applies transformers to protein folding, bridging the two fields.
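The underlying idea is easy to reproduce if you prefer to script it: collect the set of papers citing each seed and keep anything that shows up for at least two seeds. A sketch, with the citing-paper lookup left as a placeholder:

# Sketch: find "bridge papers" that cite at least two of your seed papers.
# get_citing_ids() is a placeholder for a lookup against OpenAlex, Semantic Scholar, etc.
from collections import Counter
from itertools import chain

def get_citing_ids(seed_id):
    """Placeholder: return the set of IDs of papers that cite seed_id."""
    raise NotImplementedError

def bridge_papers(seed_ids, min_links=2):
    counts = Counter(chain.from_iterable(get_citing_ids(s) for s in seed_ids))
    return [paper for paper, n in counts.items() if n >= min_links]

# bridge_papers(["seed_nlp_paper", "seed_comp_bio_paper"]) would surface work
# that connects the two clusters.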
For developers, the OpenAlex API is a game-changer. OpenAlex is a fully open catalog of scholarly metadata and an alternative to proprietary databases like Web of Science. It provides comprehensive metadata, citation graphs, and concepts (subject tags assigned by machine-learning models). You can programmatically traverse citation graphs, filter by concept, and even track the evolution of a concept over time. Here’s a simple Python example to fetch papers citing a specific work:
import requests

def get_citations(work_id):
    """Return one page of works that cite the given OpenAlex work ID."""
    url = f"https://api.openalex.org/works?filter=cites:{work_id}"
    response = requests.get(url)
    response.raise_for_status()
    return response.json()

# Example: fetch citations for "Attention Is All You Need"
# (replace the ID below with the OpenAlex work ID you have verified)
citations = get_citations("W2031326893")
for paper in citations["results"]:
    print(paper["title"], paper["publication_year"])
This script prints the first page of papers that cite the transformer paper, along with their publication years. You can extend it to build a full citation graph, though you’ll need to handle rate limits and pagination.
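For completeness, here is one way to handle the pagination using OpenAlex’s cursor paging. The parameter names follow the public documentation, but treat this as a sketch rather than a hardened client.

# Sketch: paging through all citing works with OpenAlex cursor pagination.
import requests

def get_all_citations(work_id, per_page=200, mailto="you@example.org"):
    """Collect every work citing work_id, following OpenAlex's cursor paging."""
    results, cursor = [], "*"
    while cursor:
        resp = requests.get(
            "https://api.openalex.org/works",
            params={
                "filter": f"cites:{work_id}",
                "per-page": per_page,
                "cursor": cursor,
                "mailto": mailto,  # identifies you for OpenAlex's "polite pool"
            },
        )
        resp.raise_for_status()
        data = resp.json()
        results.extend(data["results"])
        cursor = data["meta"].get("next_cursor")  # None once the last page is reached
    return results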
Synthesis and Note-Taking Tools
Once you’ve gathered papers, you need a system for recursive synthesis. Traditional note-taking apps like Evernote or OneNote are linear—they don’t handle bidirectional links well. Instead, consider Obsidian or Roam Research, which are built around graph-based thinking.
In Obsidian, you create a note for each paper, with tags for methodologies, datasets, and limitations. You then link notes using bidirectional links. For example, the note for “MAML” might link to “Reptile” (which cites it) and to “ProtoNets” (which contests it). Over time, the graph view reveals clusters and gaps. You might notice that all papers in a cluster share a common limitation, suggesting a research opportunity.
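If you are already pulling metadata programmatically, the per-paper notes can be generated rather than typed. Below is a small sketch that writes an Obsidian-style note with wiki-links from a metadata dict; the field names and link targets are placeholders for whatever your own vault uses.

# Sketch: generate an Obsidian-style note with wiki-links from paper metadata.
# Field names and link targets are illustrative.
from pathlib import Path

paper = {
    "title": "MAML",
    "year": 2017,
    "tags": ["meta-learning", "few-shot"],
    "cited_by": ["Reptile"],
    "contested_by": ["Meta-Baseline"],
}

note = "\n".join(
    [
        f"# {paper['title']} ({paper['year']})",
        "tags: " + " ".join(f"#{t}" for t in paper["tags"]),
        "",
        "## Cited by",
        *[f"- [[{p}]]" for p in paper["cited_by"]],
        "",
        "## Contested by",
        *[f"- [[{p}]]" for p in paper["contested_by"]],
    ]
)

Path(f"{paper['title']}.md").write_text(note, encoding="utf-8")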
For collaborative review, Zotero with the Obsidian Zotero Integration plugin is powerful. You can export Zotero citations directly into Obsidian notes, preserving metadata and annotations. This creates a seamless workflow: collect papers in Zotero, analyze them recursively in Obsidian, and export insights for publication.
Automated Citation Analysis
Manual citation tracking is error-prone. Automated tools can help, but they require careful validation. The scikit-learn library includes text-analysis tools that can be adapted for citation analysis. For example, you can use TF-IDF to vectorize titles, abstracts, or the sentences surrounding citations, and then cluster papers by shared terminology, as sketched below.
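A minimal version of that pipeline with scikit-learn; the snippets stand in for real abstracts or citation contexts.

# Sketch: TF-IDF + k-means over short paper texts (titles, abstracts, or
# citation contexts). The texts below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

texts = [
    "gradient-based meta-learning for few-shot image classification",
    "prototype representations for few-shot learning",
    "sparse attention for long-sequence transformers",
    "linear-time attention approximations for transformers",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(texts)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for text, label in zip(texts, labels):
    print(label, text)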
More advanced approaches fine-tune BERT-family models on academic text. SciBERT, pretrained on scientific papers, can be fine-tuned to classify citation contexts as supportive, critical, or neutral with reasonable accuracy. Here’s a simplified example using the Hugging Face Transformers library; note that the classification head it loads is untrained, so the model must be fine-tuned on labeled citation data before its predictions mean anything:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
# The classification head below is randomly initialized; fine-tune it on a
# labeled citation-intent dataset (e.g., SciCite) before trusting its output.
model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/scibert_scivocab_uncased", num_labels=3
)

def classify_citation(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probabilities = torch.softmax(outputs.logits, dim=1)
    return probabilities.argmax().item()

# Example: classify a citation context
context = "As shown in [12], the transformer architecture outperforms LSTMs on translation tasks."
citation_class = classify_citation(context)
print(citation_class)  # index into whatever label set you fine-tuned with (e.g., neutral/supportive/critical)
This model isn’t perfect—citation context is nuanced—but it can automate the initial tagging of hundreds of citations, freeing you to focus on interpretation.
Challenges and Limitations
Recursive review isn’t a silver bullet. It’s computationally intensive, both for humans and machines. Building a citation graph for a broad field like “machine learning” can involve tens of thousands of papers, making manual analysis impractical. Even with automation, you risk information overload.
Another challenge is bias. Citation graphs are influenced by academic prestige, publication venue, and language. Papers from top conferences like NeurIPS or CVPR are overrepresented, while work from smaller labs or non-English sources might be missed. Recursive review requires conscious effort to include diverse voices, perhaps by manually adding papers from preprint servers like arXiv or regional conferences.
Finally, there’s the problem of “citation pollution.” Some papers cite works superficially, without engaging with their content. Others engage in citation cartels—groups of authors who cite each other excessively to boost metrics. Recursive review must account for this by weighting citations based on contextual relevance, not just count.
The Human Element
Despite all these tools, the most critical component is the human researcher. Recursive review is a dialogue between you and the literature. You’re not just extracting information; you’re interpreting it, questioning it, and connecting it to your own knowledge. This is where the “art” of research lives.
For example, when I was reviewing papers on neural architecture search (NAS), I noticed a recurring pattern: many papers reported state-of-the-art results on standard benchmarks but failed in real-world deployments. This wasn’t obvious from reading abstracts alone. It emerged from recursively comparing limitations sections, talking to practitioners, and eventually realizing that NAS papers often overfit to benchmark characteristics. That insight led me to a deeper dive into papers on robust NAS, which ultimately shaped my own research direction.
Tools can’t replicate this kind of insight. They can only facilitate the process. The recursive researcher is a detective, piecing together clues from a vast library of ideas. The tools are the magnifying glass, the graph is the map, but the intuition is yours.
Looking Ahead
The future of recursive review is likely more automated, but not fully autonomous. We’re seeing the rise of AI systems that can generate literature reviews, but they lack the nuanced understanding of a human expert. The sweet spot is human-AI collaboration: AI handles the heavy lifting of graph construction and pattern detection, while the human provides context, judgment, and creativity.
Already, tools like Elicit and Scite are pushing in this direction. Elicit uses language models to answer research questions by synthesizing findings from multiple papers. Scite shows how a paper is cited—whether it’s supported, contradicted, or mentioned neutrally. These tools are still evolving, but they point toward a future where recursive review is faster, deeper, and more integrated.
For now, the best approach is to combine old-school rigor with new-school tools. Read deeply, build graphs, question assumptions, and stay curious. The literature is a living organism, constantly growing and changing. Recursive review is the best way to keep up.

