Integrating CO2 capture with biomass-fired combined heat and power (bio-CHP) plants is a promising route to negative emissions. However, the use of varied biomass fuels, including waste, and the dynamic operation of bio-CHP plants lead to large fluctuations in the flowrate and CO2 concentration of the flue gas (FG), which in turn affect the operation of post-combustion CO2 capture. Optimizing the dynamic operation of CO2 capture therefore requires a reliable model that predicts the FG flowrate and CO2 concentration in real time. In this paper, a data-driven model based on the Transformer architecture is developed. Model validation shows that the root mean squared error (RMSE), mean absolute percentage error (MAPE), and Pearson correlation coefficient (PPMCC) of the Transformer model are 0.3553, 0.0189, and 0.8099, respectively, for the prediction of FG flowrate, and 13.137, 0.0318, and 0.8336, respectively, for the prediction of CO2 concentration. The impact of various meteorological parameters on model accuracy is also assessed by analyzing Shapley values. Temperature and direct horizontal irradiance (DHI) are found to be the most important factors and should be selected as input features. In addition, using near-infrared (NIR) spectral data as input features is found to be an effective way to improve prediction accuracy: it reduces the RMSE and MAPE for CO2 concentration from 0.2982 to 0.2887 and from 0.0158 to 0.0157, respectively, and the RMSE and MAPE for FG flowrate from 4.9854 to 4.7537 and from 0.0141 to 0.0121, respectively. The Transformer model is also compared with other models, including a long short-term memory (LSTM) network and an artificial neural network (ANN); the results show that the Transformer model is superior in predicting complex dynamic patterns and nonlinear relationships.
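For readers who want to reproduce the three validation metrics named above, the following is a minimal sketch of how RMSE, MAPE, and the Pearson correlation coefficient are commonly computed from paired measured and predicted values. The function names and the sample numbers are hypothetical and are not taken from the paper. The sketch also assumes that MAPE is expressed as a fraction rather than a percentage, which is consistent with the magnitude of the reported values but is not stated in the text.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction.

    Assumes no measured value is zero (division by the measured value).
    """
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def pearson(y_true, y_pred):
    """Pearson product-moment correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Hypothetical flue-gas readings, for illustration only.
    measured = [10.0, 12.0, 11.0, 13.0]
    predicted = [10.5, 11.5, 11.0, 13.5]
    print("RMSE:", rmse(measured, predicted))
    print("MAPE:", mape(measured, predicted))
    print("PPMCC:", pearson(measured, predicted))
```

RMSE carries the units of the predicted quantity, so values for FG flowrate and CO2 concentration are not directly comparable with each other, whereas MAPE and the Pearson coefficient are dimensionless.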