Abstract
- Preparing and reviewing scientific research proposals impose a considerable workload on researchers, one that has grown markedly in recent years. At the same time, methods from information retrieval, machine learning and semantic technologies offer new opportunities to reduce this burden, improve text quality and systematically analyse textual data. This paper investigates how reliably the approval or rejection of a research proposal can be predicted from textual features alone. Five machine-learning classifiers were trained and evaluated on a corpus of real proposals: Boosted Tree (F1), Random Forest (F2), Decision Tree (F3), Support Vector Machine (F4) and Logistic Regression (F5). The key finding is that textual data alone contributes substantially to predicting whether a proposal is approved or rejected.