Advancing Education with Data

2017 KDD Workshop
August 14, 2017

Online learning platforms have grown tremendously in recent years, having an impact from K-12 to lifelong learning. In the classroom, educators are starting to use technology to adjust their teaching methods for better personalization. In companies, employers are leveraging online learning platforms for the delivery of training programs that enable employees to gain skills necessary for their jobs. Professionals are leveraging online learning applications to stay up to date with the skills required to stay relevant in the dynamic job market.

The data collected through these technologies presents a golden opportunity to develop data-driven methods to improve education. We can now study the impact of online learning platforms on lifelong learners, including those from non-traditional backgrounds. Better evaluation methods can improve both skills assessment and the measurement of learning efficacy. Personalization can help make learning more efficient by tailoring content to each learner’s individual background and goals.

For this year’s workshop, we are highlighting the following areas:

  • Lifelong Learning

    In the past decade, technological advances have made the job market more dynamic. The skills one needs to stay relevant in the job market are constantly changing. This creates a growing incentive to engage in lifelong learning, especially for those who want to make a career transition or keep up with the new requirements of the job market. Encouraging employees to learn at work can also help with employee retention and increase career satisfaction. Despite these benefits, there has been limited research on understanding lifelong learners and on the impact of learning at work for small or medium-sized businesses. We have an opportunity to study the implications for employers, employees and professionals, including career growth, non-traditional hiring, demographic implications, and the business impact.

  • Assessments

    Assessments are a key aspect of formative learning and summative evaluation. Formative assessments power adaptive learning systems and can make learning more efficient, while summative assessments are important for establishing a learner’s skills and abilities. In the online education setting, it is important for us to have assessments that credibly assess learners in a scalable manner. Data mining and machine learning methods have been employed to assess the quality of assessments, auto-grade assessment questions, and automatically provide quality feedback at scale.

  • Learning Analytics and Personalization

    The abundance of online learning content, and of the data collected about learners as they interact with this content, presents the opportunity to analyze the process of learning and personalize this process to meet each individual’s preferences and needs. Recent studies have explored how learning behavior (e.g., video-watching clickstream interactions) can be quantified in ways that are predictive of performance (e.g., quiz scores), showing promise for constructing low-dimensional user models that can give instructors useful analytics about learners. The other side of learning analytics is personalization, where the information gathered about learners can be used to recommend courses or videos or to create personalized learning paths.

  • Infrastructure and Services for Learning

    The introduction of scalable learning applications has also pushed the boundaries of the infrastructure that serves them. This infrastructure needs to handle data sources in many formats: graphs (knowledge and skill graphs, ontologies), structured data (ratings data, multiple-choice assessment data) and unstructured data (video transcripts, forum discussion posts, learner responses to open-form questions). Scalable services pose new operational challenges while also presenting new solutions for learning. For example, user-generated content presents opportunities to solve interesting data mining problems: we can use crowdsourcing algorithms to surface popular and highly rated content, improve recommendations, enhance search, and so on.

We hope this workshop will bring the community together to inspire new ideas for data-driven innovations in education.


  • Andrew Lan (Princeton University)
  • Christopher G. Brinton (Zoomi)
  • Jiquan Ngiam (Coursera)
  • Mung Chiang (Princeton University)
  • Richard Baraniuk (Rice University)
  • Roshan Sumbaly (Coursera)
  • Shivani Rao (LinkedIn)