<p dir="ltr"><i>The data files are not publicly available because of patient privacy concerns. Please contact kcteo@hku.hk for raw data access.</i></p><p dir="ltr">Intracerebral hemorrhage (ICH) is a subtype of stroke associated with high mortality and morbidity. Surgical treatments such as clot evacuation (CE) and external ventricular drainage (EVD) are most effective in reducing mortality and morbidity when performed within 4 to 8 hours after the ictus. To support timely clinical decision-making, numerous prognostic scoring systems have been established, yet none has been found to predict both mortality and morbidity accurately. Therefore, using 957 spontaneous ICH patient samples provided by the University of Hong Kong stroke registry, this study uses machine learning (ML) tools to develop a prognostic system that predicts six-month mortality and morbidity outcomes in ICH patients. Sample outcomes are recorded on the modified Rankin Scale (mRS) and grouped into three categories: good (mRS 0-2), poor (mRS 3-5), and death (mRS 6). The resulting ML prognosis prediction model, based on the Random Forest algorithm, achieved an overall accuracy of 0.81, with AUROCs of 0.93, 0.84, and 0.95 for good, poor, and death outcomes, respectively. Compared to the original ICH score, which is currently widely used by clinicians, the ML model obtained significantly higher overall accuracy (p<0.001), precision (p<0.001), recall (p<0.01), and AUROC for functional outcomes (p<0.001). Feature importance analysis of the ML model highlighted novel prognostic features such as upper and lower limb power, admission pulse rate, and Graeb score. Counterfactual predictions from the ML model for 124 surgically treated patients indicated an overall benefit, with an average treatment effect (ATE) of 0.290 (±0.1260) across both procedures, 0.273 (±0.1434) for CE alone, and 0.333 (±0.2554) for EVD alone. 
Treatment effects were computed from model-predicted outcome classes: class 0 (mRS 0-2), class 1 (mRS 3-5), and class 2 (mRS 6). The ATE results were validated by benchmarking against two standard causal inference methods: propensity score matching with a caliper threshold, and inverse probability of treatment weighting with weighted least squares. This study demonstrates that ML-based prognostic models can significantly improve the accuracy of predicting six-month outcomes in ICH patients and provide valuable insights into treatment effects, potentially guiding more precise clinical decision-making.</p>
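<p dir="ltr">As a rough illustration of the counterfactual step described above, the sketch below computes an ATE over treated patients by comparing a model's predicted outcome class with and without treatment. It is not the study's implementation: the <code>predict</code> callable, the <code>surgery</code> feature name, the toy model, the sign convention (untreated class minus treated class, so positive means benefit), and the use of a standard error as the uncertainty measure are all assumptions made for this example.</p>

```python
from math import sqrt
from statistics import mean, stdev


def counterfactual_ate(predict, treated_patients, treatment_key="surgery"):
    """Average treatment effect over treated patients from model predictions.

    predict: hypothetical callable mapping a feature dict to an outcome class
             (0 = mRS 0-2, 1 = mRS 3-5, 2 = mRS 6).
    Assumed sign convention: effect = predicted class without treatment
    minus predicted class with treatment, so a positive value means benefit.
    """
    effects = []
    for features in treated_patients:
        factual = dict(features, **{treatment_key: 1})         # as treated
        counterfactual = dict(features, **{treatment_key: 0})  # same patient, untreated
        effects.append(predict(counterfactual) - predict(factual))
    ate = mean(effects)
    # Standard error of the mean; the source does not state what its
    # reported +/- values represent, so this choice is an assumption.
    se = stdev(effects) / sqrt(len(effects))
    return ate, se


def toy_predict(features):
    """Stand-in for a trained classifier (illustrative only)."""
    gcs = features["gcs"]
    base = 0 if gcs > 12 else (1 if gcs > 8 else 2)
    # Toy rule: surgery improves the outcome by one class for mid-range GCS.
    if features["surgery"] == 1 and base == 1:
        return 0
    return base


# Made-up patients, not data from the study.
patients = [{"gcs": 14}, {"gcs": 10}, {"gcs": 6}, {"gcs": 11}]
ate, se = counterfactual_ate(toy_predict, patients)
print(f"ATE = {ate:.3f} (SE {se:.3f})")
```

<p dir="ltr">In practice, <code>predict</code> would wrap the trained classifier, and the estimate would still need checking against methods such as propensity score matching or inverse probability weighting, as the study does.</p>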