Two-dimensional drawings and static renderings, the mainstays of traditional three-dimensional architectural design, express space poorly, which leads to spatial cognitive bias and delayed feedback from customers. To address this, this paper proposes an immersive visualisation and real-time interactive collaboration framework based on virtual reality. A BIM-VR (Building Information Modeling-Virtual Reality) dynamic coupling system is built, and two-way real-time control of professional design parameters in the virtual environment is achieved through a three-dimensional model lightweighting algorithm and multi-modal natural interaction technology. A model optimisation pipeline that combines octree spatial subdivision with progressive mesh simplification converts the triangular faces of the Revit model into the Unity scene. The Unity HDRP (High Definition Render Pipeline) real-time rendering engine integrates an IFC4 (Industry Foundation Classes 4) standard metadata parsing module, which establishes a dynamic binding between building parameters and the virtual scene so that key design attributes are linked in real time. For natural interaction by non-professional users, the Leap Motion gesture recognition framework is used to build a semantic gesture library covering 8 types of building operations, and voice commands are parsed by a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model. Tobii eye trackers record where users look, and LSTM (Long Short-Term Memory) networks then identify the design areas users have focused on and suggest parameter modifications. The resulting modification instructions are synchronised to the BIM design platform in real time over the WebSocket protocol.
Experimental results show that the lightweight model, whose geometric redundancy is significantly reduced by octree spatial subdivision and progressive mesh simplification, loads in 6.7 seconds and runs at an average frame rate of 62.2 fps, while component recognition accuracy for non-professional users rises to over 0.93. This study confirms that virtual reality technology is feasible for creating an immersive collaborative environment in architectural design, and it offers concrete technical paths for improving the efficiency of design communication and the scientific basis of decision-making in the customer-designer collaboration model.
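
As an illustration of the octree spatial subdivision step only, a minimal Python sketch is given below. It is not the implementation used in this work: the names `OctreeNode`, `build_octree`, `count_leaves` and `count_points`, and the thresholds `max_points` and `max_depth`, are hypothetical, and the sketch partitions bare vertex positions rather than full Revit geometry.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class OctreeNode:
    """One cubic cell of the octree (hypothetical structure for illustration)."""
    center: Point
    half: float  # half the edge length of the cell
    points: List[Point] = field(default_factory=list)
    children: List["OctreeNode"] = field(default_factory=list)


def build_octree(points: List[Point], center: Point, half: float,
                 max_points: int = 8, max_depth: int = 6,
                 depth: int = 0) -> OctreeNode:
    """Recursively split a cell into octants until it holds few enough points."""
    node = OctreeNode(center, half)
    # Stop splitting when the cell is sparse enough or the depth limit is hit.
    if len(points) <= max_points or depth >= max_depth:
        node.points = list(points)
        return node

    # Assign each point to one of 8 octants using one bit per axis.
    buckets: List[List[Point]] = [[] for _ in range(8)]
    for p in points:
        idx = (int(p[0] >= center[0])
               | (int(p[1] >= center[1]) << 1)
               | (int(p[2] >= center[2]) << 2))
        buckets[idx].append(p)

    quarter = half / 2.0
    for idx, bucket in enumerate(buckets):
        if not bucket:
            continue  # empty octants are not materialised
        child_center = (
            center[0] + (quarter if idx & 1 else -quarter),
            center[1] + (quarter if idx & 2 else -quarter),
            center[2] + (quarter if idx & 4 else -quarter),
        )
        node.children.append(
            build_octree(bucket, child_center, quarter,
                         max_points, max_depth, depth + 1))
    return node


def count_leaves(node: OctreeNode) -> int:
    """Number of leaf cells in the tree."""
    if not node.children:
        return 1
    return sum(count_leaves(c) for c in node.children)


def count_points(node: OctreeNode) -> int:
    """Total number of points stored in the leaves."""
    if not node.children:
        return len(node.points)
    return sum(count_points(c) for c in node.children)
```

In a pipeline of the kind described here, the resulting cells could serve as units for culling or for per-region mesh simplification; how the cells are actually consumed in this system is not specified in the abstract.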