In some complex systems, internal and external factors drive periodic changes among runtime stages, with each stage exhibiting distinct dynamics. When data-driven parameterized methods are employed to model and predict such systems, a single unified model restricts the learning of the dynamics and transitions of the multiple stages. To address this challenge, inspired by the ordinary differential equation network (ODENet), this paper proposes a novel predictive simulation framework, referred to as the deterministic finite automaton ordinary differential equation network (DFA-ODENet). DFA-ODENet is a continuous-time deep learning framework designed to model periodic multistage systems from irregularly sampled historical system trajectories. The model comprises two principal predictors, one for the system dynamics and one for the stage transitions. To learn the system dynamics, the model contains several ODENets, one per stage of the modeled system; each ODENet individually learns the continuous-time nonlinear dynamics within its respective stage. To learn the stage transitions, a stage transition predictor learns the duration of each stage from observational data, whose training samples are prelabeled with stage information based on prior knowledge of the system. During prediction, the stage transition predictor serves as a switch that selects the appropriate ODENet to predict the system outputs. Moreover, the framework incorporates an encoder–decoder structure: the encoder solves for the initial state from historical system inputs and outputs, while the decoder predicts future system outputs from the solved initial state and the inputs of the prediction window.
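The switching mechanism above can be illustrated with a minimal sketch. This is not the paper's implementation: the learned neural vector fields are replaced by hand-coded placeholder dynamics, the learned stage durations by fixed constants, and the ODE solver by simple Euler integration; only the structure (one dynamics function per stage, a duration-based switcher selecting which one integrates the state) reflects the described framework.

```python
# Illustrative sketch of a DFA-ODENet-style prediction loop.
# Placeholder assumptions: two stages with hand-coded linear dynamics
# standing in for learned ODENets, fixed per-stage durations standing
# in for the learned stage transition predictor, Euler integration.

def make_stage_dynamics():
    # One "ODENet" per stage; simple closed-form vector fields here.
    return [
        lambda t, x: -0.5 * x,        # stage 0: exponential decay
        lambda t, x: 0.3 * (1.0 - x), # stage 1: relaxation toward 1
    ]

def stage_switcher(t, durations):
    # Stand-in for the stage transition predictor: maps elapsed time
    # within one period to the index of the active stage.
    tau = t % sum(durations)
    for i, d in enumerate(durations):
        if tau < d:
            return i
        tau -= d
    return len(durations) - 1

def predict(x0, t0, t1, dt, dynamics, durations):
    # Euler integration; at each step the switcher selects which
    # stage's dynamics advance the state.
    x, t = x0, t0
    traj = [(t, x)]
    while t < t1 - 1e-12:
        f = dynamics[stage_switcher(t, durations)]
        x = x + dt * f(t, x)
        t += dt
        traj.append((t, x))
    return traj
```

In the full framework, each placeholder lambda would be a trained neural network integrated by an ODE solver, and the switcher's durations would themselves be predicted from data rather than fixed.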
To evaluate the feasibility and effectiveness of the proposed approach, the encoder–decoder framework is applied to the cooling system of a real data center to simulate specific dynamic variables during operation. Given multivariate operational data, including server power and environmental temperature, the model successfully simulates the system behavior under the expected operational patterns and predicts the open-loop output variables, such as power consumption and inlet air temperature. Notably, even when the prediction horizon extends beyond 30 min, the mean absolute percentage error of the predicted energy consumption remains below 5%. Furthermore, the learned simulation model is used to optimize the cooling temperature settings, which determine when to pause the cooling compressor. Simulation experiments indicate that adopting the inferred optimal temperature settings saves up to 18% of the cooling energy.