Against the backdrop of “Internet Plus,” what counts as learning ability for English learners continues to evolve. This paper uses principal component analysis (PCA) to screen the effective data elements of English graduates’ competencies, clarify the logical relationships among the learning-ability factors and their contribution rates to the competency structure model, and identify the core principal components. Drawing on learner profile characteristics and online learning data collected from students across the four semesters of a two-year English program, the study identifies the attribute features needed to construct English learner profiles. A data analysis model is then built using the DPCA method combined with a K-means algorithm modified in how it selects initial cluster centers. Multidimensional feature reduction and extraction applied to the student behavioral data yield an objective and detailed learner profile. The DPCA-K-means algorithm categorizes English learners into four groups: excellent learners, efficient learners, low-level learners, and high-risk learners. Efficient learners performed well in task completion, video viewing progress, and chapter test completion, although their completion rates were not the highest among the groups. They had the highest video replay rates and participated most frequently in online discussions, indicating proactive and motivated learning.
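
The reduce-then-cluster pipeline summarized above can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: plain PCA (via SVD on standardized features) stands in for DPCA, whose details are not given here; a farthest-point rule stands in for the modified initial-center selection; and the five behavioral features, the group means, and all data are synthetic and purely hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize each feature, then project onto the top principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    contribution = s**2 / np.sum(s**2)  # share of variance per component
    return Z @ Vt[:n_components].T, contribution[:n_components]

def farthest_point_init(Y, k):
    """Assumed initialization rule (not the paper's): start from the point
    nearest the data mean, then repeatedly add the point farthest from
    all centers chosen so far."""
    first = np.argmin(np.linalg.norm(Y - Y.mean(axis=0), axis=1))
    centers = [Y[first]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(Y - c, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    return np.array(centers)

def kmeans(Y, k, n_iter=100):
    """Standard Lloyd iterations starting from the deterministic centers above."""
    centers = farthest_point_init(Y, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic data: 4 hypothetical learner groups, 25 students each.
# Columns (hypothetical): task completion, video progress,
# chapter test completion, video replay rate, discussion activity.
rng = np.random.default_rng(0)
group_means = np.array([
    [0.95, 0.95, 0.95, 0.40, 0.50],
    [0.85, 0.85, 0.85, 0.90, 0.90],
    [0.50, 0.50, 0.50, 0.20, 0.20],
    [0.15, 0.15, 0.15, 0.05, 0.05],
])
X = np.vstack([m + rng.normal(0.0, 0.03, size=(25, 5)) for m in group_means])

Y, contribution = pca_reduce(X, n_components=2)
labels, centers = kmeans(Y, k=4)
print("contribution rates:", contribution)
print("cluster sizes:", np.bincount(labels, minlength=4))
```

Because the initialization is deterministic, repeated runs on the same data give the same partition, which is one common motivation for replacing random initial centers. The cluster indices themselves carry no meaning; mapping them to labels such as "efficient" or "high-risk" would require inspecting each cluster's feature means.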