
Experiments

Survey Quality Evaluation (Human Study)

10 topics, 12 expert evaluators
Metrics
Coherence (AutoSurvey): 4.1/5
Coherence (Human-written): 4.4/5
Coverage (AutoSurvey): 91.3%
Citation Accuracy: 96.7%
Time Reduction: ~95% (weeks → hours)
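
To put the two coherence scores side by side, the short Python sketch below computes the absolute gap and the ratio between them. It uses only the two figures reported in the metrics; the function and constant names are illustrative assumptions and are not taken from the AutoSurvey code or evaluation scripts.

```python
# Mean coherence ratings as reported in the metrics (5-point scale).
AUTOSURVEY_COHERENCE = 4.1
HUMAN_COHERENCE = 4.4


def coherence_gap(auto: float, human: float) -> float:
    """Absolute gap in rating points (human-written minus AutoSurvey)."""
    return human - auto


def relative_coherence(auto: float, human: float) -> float:
    """AutoSurvey coherence as a fraction of the human-written score."""
    return auto / human


gap = coherence_gap(AUTOSURVEY_COHERENCE, HUMAN_COHERENCE)
ratio = relative_coherence(AUTOSURVEY_COHERENCE, HUMAN_COHERENCE)
print(f"gap: {gap:.1f} points, ratio: {ratio:.1%}")
# prints "gap: 0.3 points, ratio: 93.2%"
```

Under this reading, AutoSurvey's coherence rating is about 0.3 points below the human-written surveys on the 5-point scale. The sketch says nothing about statistical significance, which would require the per-evaluator scores.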
Downloadable Artifacts
evaluation_forms.pdf (PDF, 3.2 MB): coming soon
expert_scores.csv (CSV, 450 KB): coming soon
sample_surveys.zip (Archive, 28 MB): coming soon