To address the low efficiency of safety hazard identification and the poorly targeted risk prevention in directional drilling operations in coal mines, this paper constructs a safety hazard identification model that integrates text mining, probabilistic prediction, and association analysis. First, the native training tasks of the BERT model are enhanced, and a pre-trained language model for the coal mine safety domain based on DP-MLM and SOP is proposed. Second, a Markov model is employed to predict the probability of hazard occurrence, quantifying the state transition process of the hazard monitoring system. Third, an enhanced MM-Apriori algorithm is used to uncover latent associations within hazard transactions. Case analysis yields strong association rules and risk assessments. Among the management-related rules, “HSE Management => Absent” has a support of 0.7353 and “Inspection => Inadequate” has a support of 0.7186. “Non-compliance => Not” and “Not => Non-compliance” form a bidirectional association, with supports of 0.4923 and 0.3882, respectively. The analysis suggests that a lack of strict technical standards and operating procedures in the relevant units, or the neglect of technical standards by construction workers, can degrade construction quality and cause safety accidents.
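The Markov prediction step can be illustrated with a minimal sketch. It assumes a first-order chain over two hypothetical monitoring states ("normal" and "hazard") and a made-up observation log. The state names, the data, and the helper functions `estimate_transition_matrix` and `step` are illustrative assumptions, not the model or data used in this paper.

```python
from collections import Counter


def estimate_transition_matrix(sequence, states):
    """Estimate first-order transition probabilities by counting observed transitions."""
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for s in states:
        total = sum(counts[(s, t)] for t in states)
        # A state with no observed outgoing transitions falls back to a uniform row.
        matrix[s] = {
            t: (counts[(s, t)] / total if total else 1 / len(states))
            for t in states
        }
    return matrix


def step(distribution, matrix):
    """Propagate a state distribution one step forward (p' = p P)."""
    states = list(matrix)
    return {
        t: sum(distribution[s] * matrix[s][t] for s in states)
        for t in states
    }


# Hypothetical monitoring log (illustrative only, not data from the paper).
STATES = ["normal", "hazard"]
log = ["normal", "normal", "hazard", "normal",
       "hazard", "hazard", "normal", "normal"]

P = estimate_transition_matrix(log, STATES)

# Start from a system known to be in the "normal" state and predict one step ahead.
current = {"normal": 1.0, "hazard": 0.0}
next_step = step(current, P)
print(P)
print(next_step)
```

Repeated calls to `step` give the predicted state distribution further ahead. In practice the transition matrix would be estimated from the hazard monitoring records rather than from a toy log.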