Abstract: Video assignments, submitted as examinations or homework, are drawing increasing attention in education as an efficient way to assess students' abilities. How to evaluate such assignments effectively and accurately has become a significant research topic. This work proposes a method based on a multi-channel CNN-LSTM hybrid architecture that extracts and classifies image features, such as students' actions and expressions, and audio features, such as speech rate and pauses, from video assignments, and then produces a binary assessment of "qualified" or "unqualified". In addition, deploying the system in a cloud computing environment as a Cloud-based Intelligent Evaluation Service application could provide a general-purpose service that meets the needs of multiple teaching units. Experiments show that the proposed method is feasible and effective.