API Reference

Complete method reference for the Literature Mapper Python API.


LiteratureMapper

The main entry point for corpus management.

from literature_mapper import LiteratureMapper

mapper = LiteratureMapper(corpus_path, model_name="gemini-3-flash-preview")

Processing

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| process_new_papers() | recursive=False | ProcessingResult | Process unprocessed PDFs |
| update_citations() | email=None | None | Fetch OpenAlex citation data |

Retrieval

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| get_all_analyses() | — | DataFrame | All papers as a DataFrame |
| get_paper_by_id() | paper_id: int | dict | Details for a single paper |
| search_papers() | column: str, query: str | DataFrame | Keyword search |
| search_corpus() | query, semantic, use_enhanced, min_year, max_year, node_types, limit | list[dict] | Semantic search |
| get_statistics() | — | CorpusStats | Corpus-level counts |

Normalization

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| normalize_authors() | mappings: dict | int | Merge author aliases |
| normalize_concepts() | mappings: dict | int | Merge concept aliases |
| get_canonical_author() | alias: str | str | Resolve an author alias |
| get_canonical_concept() | alias: str | str | Resolve a concept alias |
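
The mappings argument for both normalize_* methods maps alias strings to a canonical name. A minimal sketch of that shape, with a hypothetical resolve helper to illustrate the alias-resolution behavior (this is not the library's implementation):

```python
# Shape of the mappings dict accepted by normalize_authors() /
# normalize_concepts(): alias -> canonical name.
mappings = {
    "Smith, J.": "Jane Smith",
    "J. Smith": "Jane Smith",
}

def resolve(alias: str, mappings: dict[str, str]) -> str:
    """Hypothetical illustration of alias resolution: return the
    canonical name for an alias, or the alias itself if unmapped."""
    return mappings.get(alias, alias)

print(resolve("Smith, J.", mappings))   # Jane Smith
print(resolve("Jane Smith", mappings))  # Jane Smith (already canonical)
```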

Temporal

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| get_concept_timeline() | concept=None, top_n=10 | DataFrame | Temporal data per concept |
| compute_temporal_stats() | — | None | Compute trend statistics |
| get_trending_concepts() | direction, limit | DataFrame | Rising or declining concepts |
| detect_concept_eras() | gap: int | DataFrame | Detect concept eras and revivals |

Genealogy

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| build_genealogy() | verbose=False | dict | Infer paper relationships |
| find_contradictions() | concept=None | DataFrame | Find CHALLENGES edges |
| get_argument_evolution() | concept: str | DataFrame | Trace relationships over time |

Synthesis

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| synthesize_answer() | question, year_range=None, limit=10 | str | RAG answer synthesis |
| validate_hypothesis() | hypothesis: str, limit=10 | dict | Hypothesis evaluation |

Export

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| export_to_csv() | output_path: str | None | Export to CSV |

GhostHunter

Gap detection for missing papers and authors.

from literature_mapper.ghosts import GhostHunter

hunter = GhostHunter(mapper)

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| find_bibliographic_ghosts() | threshold=3 | DataFrame | Cited papers missing from the corpus |
| find_missing_authors() | threshold=5 | DataFrame | Cited authors missing from the corpus |

CorpusAnalyzer

Corpus-level statistics and analytics.

from literature_mapper.analysis import CorpusAnalyzer

analyzer = CorpusAnalyzer(corpus_path)

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| get_year_distribution() | — | DataFrame | Papers per year |
| get_top_authors() | limit=10 | DataFrame | Most prolific authors |
| get_top_concepts() | limit=10 | DataFrame | Most frequent concepts |
| find_hub_papers() | limit=10 | DataFrame | Most referenced papers |

Visualization

from literature_mapper.viz import export_to_gexf

export_to_gexf(corpus_path, output_path, mode="semantic", threshold=0.05)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| corpus_path | str | required | Path to the corpus directory |
| output_path | str | required | Output .gexf file path |
| mode | str | "semantic" | Graph type: semantic, authors, concepts, river, or similarity |
| threshold | float | 0.05 | Minimum edge weight (similarity mode only) |
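
In similarity mode, threshold prunes weak edges before export. A hypothetical sketch of that filtering effect, assuming an inclusive cut-off on edge weight (this is not export_to_gexf's actual code):

```python
# Illustration only: edges whose weight falls below the threshold
# are dropped from the exported graph.
edges = [
    ("paper_1", "paper_2", 0.42),
    ("paper_1", "paper_3", 0.03),  # below threshold, pruned
    ("paper_2", "paper_3", 0.05),  # at threshold, kept (assuming inclusive)
]

threshold = 0.05
kept = [(u, v, w) for u, v, w in edges if w >= threshold]
print(kept)
```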

Data Classes

ProcessingResult

@dataclass
class ProcessingResult:
    processed: int    # Number of successfully processed PDFs
    skipped: int      # Already in database
    failed: int       # Processing errors
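
A ProcessingResult is normally returned by process_new_papers(); the fields can be consumed as below. The dataclass is redefined inline (with made-up counts) so the sketch is self-contained:

```python
from dataclasses import dataclass

# Redefined inline from the definition above; in real use this comes
# back from mapper.process_new_papers().
@dataclass
class ProcessingResult:
    processed: int    # Number of successfully processed PDFs
    skipped: int      # Already in database
    failed: int       # Processing errors

result = ProcessingResult(processed=12, skipped=3, failed=1)
total = result.processed + result.skipped + result.failed
print(f"{result.processed}/{total} PDFs processed, {result.failed} failed")
```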

CorpusStats

@dataclass
class CorpusStats:
    total_papers: int
    total_authors: int
    total_concepts: int
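
Likewise for CorpusStats, normally returned by get_statistics(); redefined inline here with made-up counts for a self-contained sketch:

```python
from dataclasses import dataclass

# Redefined inline from the definition above; in real use this comes
# back from mapper.get_statistics().
@dataclass
class CorpusStats:
    total_papers: int
    total_authors: int
    total_concepts: int

stats = CorpusStats(total_papers=250, total_authors=480, total_concepts=1200)
print(f"{stats.total_papers} papers, {stats.total_authors} authors, "
      f"{stats.total_concepts} concepts")
```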