Scaling Automated CS Education

The success of Salman Khan’s Academy and other instances of disruptive education has started me thinking about how computer science education might scale. Let’s first analyze how Khan organizes the learning experience.

First: Have a huge collection of videos. Khan’s library has grown organically. Each video introduces a single topic through an example. The videos are never longer than 15 minutes, so the example and explanation have to be concise and straightforward. Any context or motivation for the topic has to be embedded in the example itself; it does not warrant its own digression. Extended lecture is one of the least effective mechanisms for learning, so the short video format works well. Additionally, each video only has to be made once, and it can be replayed as often as necessary by every student.

Second: Map out the dependencies. Once the video library grows large enough, it has to be organized. Some subjects, such as mathematics, have a clear dependency chain of topics, but any subject can be broken down into a network of interdependent topics. This web is not necessarily topologically orderable, but an ordering that roughly corresponds to historical development likely minimizes the forward references and digressions that occur when two seemingly unrelated topics must be introduced before they are later unified. The map itself can be visualized.
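To make the idea concrete, here is a minimal sketch in Python of such a topic map, assuming a tiny invented set of topics (a real library would have thousands of nodes). A topological sort produces one possible linear ordering of the web:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical topic map: each topic lists its prerequisites.
topics = {
    "counting": [],
    "addition": ["counting"],
    "multiplication": ["addition"],
    "division": ["multiplication"],
    "fractions": ["division", "multiplication"],
}

# One valid linear ordering of the web; many exist, and a curriculum
# would pick the one that minimizes forward references.
print(list(TopologicalSorter(topics).static_order()))
# e.g. ['counting', 'addition', 'multiplication', 'division', 'fractions']
```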

Third: Merit badges for topic understanding. If the discipline can be broken down into separate topics, then you can develop exercises that test each individual concept. Mastery of a topic is measured by performance on its exercises. Each time a student proves they ‘get it’, a merit badge is awarded that unlocks access to the more abstract, more complex, more involved topics that follow. This ensures that each student builds the skills they need before continuing. Everyone can develop mastery at their own pace, with no strict schedule to keep. Understanding and development become the focus while examination and assessment are de-emphasized. You can see your achievements and progress overlaid on the topic dependency visualization.
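A sketch of the unlocking rule, reusing the hypothetical topic map above: a topic becomes available once every prerequisite carries a merit badge. (The function and data are my own illustration, not Khan Academy’s implementation.)

```python
def unlocked(topics, badges):
    """Topics a student may attempt next: not yet mastered,
    but every prerequisite already carries a merit badge."""
    return [t for t, prereqs in topics.items()
            if t not in badges and all(p in badges for p in prereqs)]

# A student who has earned badges for counting and addition
# may now attempt multiplication, but not yet division or fractions.
print(unlocked(topics, badges={"counting", "addition"}))
# ['multiplication']
```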

Fourth: No penalty for failure. Each student is allowed to develop at their own pace, watch the topic video as many times as necessary, and attempt the exercises as many times as necessary until they ‘get it’. No ‘Fail’ or other negative mark lands on your permanent record just because you could not keep pace with the rest of the class and the teacher’s schedule. You are free to go back and review topics you have forgotten, and you likely will, because each video is less than 15 minutes. Understanding the material well enough to collect merit badges becomes its own reward.

That sounds like a wonderful system! And it works out quite nicely for subjects that have short, simple exercises to assess individual, isolated topics. But for subjects other than math and science, I’m not sure it works as well. There may be issues scaling this educational framework.

Apply the above organization to computer science and we get something like Codecademy. It doesn’t have a graphical visualization of the topic dependency map, but it does have a way to track and share your progress. The exercises are simple, but I find them too knowledge-based. Yes, you can easily learn how for loops work, and the interface encourages exploration. But it doesn’t teach the most important part of programming skill: design trade-offs and decision making.
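That knowledge-based style of exercise is easy to automate precisely because it is output-based. A minimal sketch of what such a grader amounts to (the exercise and test cases are invented for illustration):

```python
def grade(student_fn, cases):
    """Pass/fail grading by comparing outputs against expected values.
    It can verify *what* the code computes, not *how* it is organized."""
    return all(student_fn(*args) == expected for args, expected in cases)

# Hypothetical exercise: "sum the numbers from 1 to n with a for loop".
def student_answer(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

print(grade(student_answer, [((0,), 0), ((3,), 6), ((10,), 55)]))  # True
```

The grader accepts any code that produces the right outputs; it has no opinion about how the code is organized.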

An automated system is pretty good at teaching the basics: types, variables, memory, assignment, loops, conditionals, and so on. Yes, it can be gamed and deceived, but if you manage that you probably have more understanding than the automated assessment expects. But it’s not good at giving design feedback; those exercises don’t scale. It has no way for me to learn the GoF Design Patterns. It has no library of common solutions to common problems in that particular language. It doesn’t tell me that I have to practice good indentation and naming lest I confound myself. It doesn’t tell me that I should write many small functions with good names, that I should handle edge cases first to get them out of the way, or that I should prefer local variables over global variables. In short, it lets me acquire coding habits (like global variables and flags that control loops) that work on small exercises but will ultimately prevent me from becoming a professional developer.
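To make that concrete, here is an invented exercise solution in the style an output-based grader happily accepts, next to the shape a human reviewer would push you toward. Both return the same answers on non-empty input, so an automated checker cannot tell them apart:

```python
total = 0  # global state: works on a small exercise, confounds a larger program

def average(numbers):
    global total
    total = 0
    i = 0
    running = True
    while running:                      # a flag that controls the loop
        if i == len(numbers):
            running = False
        else:
            total = total + numbers[i]
            i = i + 1
    return total / len(numbers)         # fails confusingly on an empty list

def average_reviewed(numbers):
    if not numbers:                     # edge case handled first, out of the way
        raise ValueError("average of an empty list is undefined")
    return sum(numbers) / len(numbers)  # local state only, one small function
```

An automated grader sees two functions that pass the same tests; only a human reviewer notices the global, the flag-controlled loop, and the unhandled edge case.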

If we are going to automate our education by breaking it up into small pieces, we have to make sure we also teach how those pieces interconnect. It’s not enough to know all the different Lego pieces; I also have to know how to assemble them. Unfortunately, design issues are much fuzzier. The trade-offs are trickier. The feedback is more about understanding explanations and the reasons why than about rote knowledge and brute-force logic.

People tend to ignore their designs until they run into maintenance problems, and only then do they begin to desire and learn about code organization and design. Only when they have created complexity beyond their capacity to manage it do they feel a need for a way out. So our educational system has to build in a mechanism for waiting until the student is ready for the knowledge. For example, as Steve Yegge recounts of his reading of Fowler’s book on Refactoring, even smart developers who continually practice their craft can be blind to this need for years.

I’ll close with a question: how can you automate feedback about design trade-offs and code organization?