Analyzing the Sentiment of MOOC Discussion Posts


  • Haniya Ahmed, WISEST student researcher, University of Alberta
  • Kenny Wong, Department of Computer Science, University of Alberta



sentiment, MOOC


The purpose of this project is to identify common difficulties that learners face and to understand their emotions as they progress through MOOCs (Massive Open Online Courses). The research uses data from ten different Coursera courses. Pieces of text written by students are extracted from this data and sent to the Google Cloud Natural Language API, a service that returns a sentiment analysis of a given text. The main goal is to help instructors monitor their MOOCs and improve the courses, so that students can progress more efficiently and easily.

 To achieve this, the first step is to gather all the data from each of the courses. The data is then loaded into a single database using Python and SQL, with PyCharm as the development environment. Once the database is created, queries select only the pieces of information that are needed: the texts in which students make comments or ask questions. Next, these texts are sent to the Google Cloud Natural Language API. In simplified terms, the analysis breaks each sentence down into words, categorizes each word by whether its connotation is positive, negative or neutral, and tallies the words in each category. The overall sentiment follows the category with the highest count; if positives and negatives balance out, the sentiment is neutral. Sentiment scores range from -1 to 1, where -1 is the most negative, 1 is the most positive and values near 0 are neutral.
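
 The steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the table layout (posts with course, kind and body columns), the sample rows and the tiny word lists are all hypothetical. The toy_score function only imitates the simplified word-tally description so that the sketch runs without credentials; google_score shows how a single text could be sent to the Cloud Natural Language API, assuming the google-cloud-language package is installed and credentials are configured.

```python
import sqlite3

# Hypothetical toy word lists, for offline illustration only.
# The real API does not expose or rely on lists like these.
POSITIVE = {"great", "helpful", "clear", "love", "thanks"}
NEGATIVE = {"confusing", "stuck", "hard", "broken", "frustrated"}


def toy_score(text):
    """Stand-in scorer: (positives - negatives) / matched words, in [-1, 1]."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def google_score(text):
    """Score one text with the Cloud Natural Language API.

    Requires the google-cloud-language package and valid credentials,
    so the import is kept local to this function.
    """
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    return response.document_sentiment.score


def build_database(rows):
    """Load (course, kind, body) rows into one in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (course TEXT, kind TEXT, body TEXT)")
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def score_posts(conn, scorer):
    """Select only comments and questions, then score each text."""
    query = (
        "SELECT course, body FROM posts "
        "WHERE kind IN ('comment', 'question') ORDER BY rowid"
    )
    return [(course, body, scorer(body)) for course, body in conn.execute(query)]


# Made-up sample rows standing in for the gathered course data.
sample_rows = [
    ("course_a", "question", "I am stuck on this confusing quiz."),
    ("course_a", "comment", "Great lecture, thanks!"),
    ("course_b", "metadata", "not a student post"),
]
conn = build_database(sample_rows)
results = score_posts(conn, toy_score)
for course, body, score in results:
    print(course, score, body)
```

 Passing the scorer as an argument keeps the database steps testable offline; replacing toy_score with google_score in the call to score_posts would send the same selected texts to the API instead.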

 Positive sentiment scores indicate to instructors that students are doing well in the course, and neutral sentiment scores indicate that the course balances difficult and easy tasks. Negative sentiment, however, is the most important to instructors, since it indicates that students are struggling and that the course needs improvement.
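
 As a rough sketch of how scores in the range -1 to 1 might be turned into the three categories above and summarized per course, consider the following. The cutoff of 0.25 for the neutral band, the function names and the sample scores are assumptions made for illustration; the project does not state a specific threshold.

```python
from collections import defaultdict

# Assumed cutoff: scores strictly between -0.25 and 0.25 count as neutral.
NEUTRAL_BAND = 0.25


def label(score):
    """Map a sentiment score in [-1, 1] to positive, neutral or negative."""
    if score >= NEUTRAL_BAND:
        return "positive"
    if score <= -NEUTRAL_BAND:
        return "negative"
    return "neutral"


def negative_share(scored_posts):
    """Return {course: fraction of that course's posts labeled negative}."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for course, score in scored_posts:
        totals[course] += 1
        if label(score) == "negative":
            negatives[course] += 1
    return {course: negatives[course] / totals[course] for course in totals}


# Made-up (course, score) pairs for illustration.
scored = [("course_a", 0.8), ("course_a", -0.6), ("course_b", 0.1)]
shares = negative_share(scored)
print(shares)
```

 A course with a high share of negative posts would be a natural candidate for an instructor to review first.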