With the development of virtual reality devices and real-time rendering technology, animation design has gradually shifted from linear playback to immersive, interactive generation. Addressing the problem of interactive animation design based on virtual reality technology, this paper constructs a framework that integrates hierarchical management of 3D scene resources, skeletal rigging and skinning computation, motion-capture input encoding, gesture recognition, finite-state animation control, and adaptive rendering load adjustment. An interactive animation prototype is implemented with Unity 3D, OpenXR, and C# scripts; user pose, hand trajectories, trigger events, and runtime logs are collected through a head-mounted display, two hand controllers, and a spatial positioning device. The prototype was evaluated with 36 users, 4 types of virtual scenes, and 12,480 valid interaction logs. The results show an average frame rate of 91.8 FPS in low-complexity scenes, while highly dynamic complex scenes still maintain 76.9 FPS. Interaction response latency is concentrated in the range of 18 to 32 ms, peak GPU utilization is 84.7%, and the average user experience score is 4.42. These results indicate that the method can improve the real-time performance, stability, and immersive experience of virtual reality interactive animation, and that it offers a useful reference for intelligent animation design and the optimization of virtual interaction systems.