Abstract
The preparation and evaluation of scientific research proposals place a heavy workload on scientists, and this workload has increased significantly in recent years. At the same time, information retrieval, machine learning and semantic technologies have the potential to reduce this workload, improve the quality of scientific texts and analyse metadata. This paper focuses on predicting the probability that a research proposal is approved or rejected on the basis of formal features, using machine learning methods. To this end, an experiment with five machine learning classifiers, Boosted Tree Classifier (F1), Random Forest Classifier (F2), Decision Tree Classifier (F3), SVM Classifier (F4) and Logistic Classifier (F5), was conducted on a corpus of research proposals. The main result is that statistical features derived from metadata alone contribute substantially to predicting whether a research proposal is approved; the average accuracy [1] across the five methods is 75.04 %.
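
As an illustration of how an average accuracy across several classifiers can be computed, the following minimal Python sketch may be helpful. It assumes an unweighted mean of the per-classifier accuracies, which the abstract does not state explicitly. The function names and the toy labels are hypothetical and are not taken from the experiment.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of proposals whose predicted outcome matches the actual one."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def average_accuracy(y_true, predictions_by_classifier):
    """Unweighted mean of the per-classifier accuracies (an assumption)."""
    return mean(
        accuracy(y_true, y_pred)
        for y_pred in predictions_by_classifier.values()
    )


# Made-up toy labels (1 = approved, 0 = rejected); NOT data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions for the five classifiers F1 to F5.
toy_predictions = {
    "F1": [1, 0, 1, 1, 0, 0, 1, 1],
    "F2": [1, 0, 1, 0, 0, 0, 1, 0],
    "F3": [1, 1, 1, 1, 0, 0, 0, 0],
    "F4": [1, 0, 0, 1, 0, 1, 1, 0],
    "F5": [0, 0, 1, 1, 0, 0, 1, 0],
}

if __name__ == "__main__":
    for name, y_pred in toy_predictions.items():
        print(f"{name}: {accuracy(y_true, y_pred):.2%}")
    print(f"average: {average_accuracy(y_true, toy_predictions):.2%}")
```

With real data, `y_true` would hold the recorded funding decisions and each entry of `toy_predictions` would hold the predictions of one trained classifier on the same held-out proposals.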