Gap Detection (Ghost Hunting)¶
Find frequently-cited papers and authors that are missing from your corpus—structural gaps in your literature review.
What is Ghost Hunting?¶
Ghost hunting uses citation data to identify works that are referenced by your corpus but not included in it. This reveals:
- Seminal works you may have overlooked
- Key researchers whose voices are missing
- Intellectual foundations of your field
Prerequisite
Ghost hunting requires running update_citations() first to fetch OpenAlex data.
Ghost Hunting Modes¶
| Mode | What It Finds |
|---|---|
bibliographic |
Papers cited by your corpus but not included |
authors |
Researchers cited frequently but not represented |
Bibliographic Ghosts¶
Find missing papers that are frequently cited by your corpus:
from literature_mapper.ghosts import GhostHunter
hunter = GhostHunter(mapper)
# Find papers cited by at least 3 corpus papers
ghosts = hunter.find_bibliographic_ghosts(threshold=3)
if not ghosts.empty:
print(f"Found {len(ghosts)} missing papers:\n")
for _, row in ghosts.head(5).iterrows():
print(f" [{row['citation_count']:2d}Ă—] {row['title'][:50]}...")
print(f" {row['author']} ({row['year']})")
else:
print("No gaps found (run update_citations() first)")
Interpreting Results¶
| Column | Meaning |
|---|---|
citation_count |
Number of corpus papers citing this work |
title |
Title of the missing paper |
author |
First author or author list |
year |
Publication year |
Actionable Gap Analysis
Papers with high citation counts (cited by many corpus papers) are strong candidates to add to your literature review.
Missing Authors¶
Find researchers who are frequently cited but don't have papers in your corpus:
# Find authors cited by at least 5 papers
missing = hunter.find_missing_authors(threshold=5)
if not missing.empty:
print("Frequently cited authors not in corpus:\n")
for _, row in missing.head(5).iterrows():
print(f" {row['author']} (cited by {row['cited_by_papers']} papers)")