Alan Zhao

Jul 23, 2016

Thoughts on Coursera's Algorithms and Data Structures

Motivation

After spending the past three years learning programming largely on my own with a "just make it work" mentality, I decided in January 2016 to formalize my knowledge with an actual course. Massive open online course providers like Coursera, edX, and Codecademy offer lots of programming courses, but the majority of these are introductory-level or application-specific (like data analysis with Python or web development with Ruby). I wanted a general "next level" course that I could also do in Python. I talked to software engineer friends, looked at some universities' syllabi, and found that Data Structures and Algorithms was the standard second or third course.

Coursera had an entire six-month Data Structures and Algorithms specialization, and it (mostly) fit before my graduate school began, so I signed up. I liked that the class would go in-depth, with five separate courses plus a "capstone" applied project. It also accepted numerous languages (Ruby, Python 2 and 3, C, C++, Java) but only officially supported Java, C, and Python 3 with starter files. To further motivate myself to actually complete the course, I paid the ~400 dollars to get it verified (and for the nifty certificates on my LinkedIn).

Course Structure

Unsurprisingly, the course philosophy is learning through doing. You watch 1-2 hours of lectures, with some embedded quizzes, and then code solutions to the problem set. The solutions are submitted to a cold, inhuman grader that compiles your code and checks it against 15+ test cases. There's also a discussion forum for posting questions and answers. The homeworks are well designed and closely follow the lectures. I did find that the coursework typically took double the estimated time (6-8 hours vs 3-4 hours). The lectures are pre-recorded, and each course runs in set sessions. Missing a session is fine since you can restart the course in a later one, but you only have one full year from payment to finish the six-month specialization.

Learnings

Testing

Learning how to design test cases and automate them was the most valuable takeaway from the course. The grader's test cases are hidden beyond the first 3, so you need to become adept at writing your own test cases in order to pass. Running tests manually becomes incredibly annoying, so I got much more comfortable with the assert statement. The course introduced me to the idea of testing corner cases with hand-written inputs, and then automating testing with random inputs (checked against correct outputs calculated by brute force).
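To make the random-input idea concrete, here's a minimal sketch of a stress test. The problem (largest product of two numbers in a list) and all the function names are my own illustration, not something handed out by the course: a slow, obviously correct brute-force version acts as the referee for a faster candidate solution.

```python
import random


def max_pairwise_product_naive(nums):
    """Brute force: try every pair. Slow, but easy to trust."""
    best = nums[0] * nums[1]
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            best = max(best, nums[i] * nums[j])
    return best


def max_pairwise_product_fast(nums):
    """Candidate solution: multiply the two largest values.

    Assumes all inputs are non-negative (an assumption of this sketch).
    """
    second_largest, largest = sorted(nums)[-2:]
    return second_largest * largest


def stress_test(trials=1000, max_len=10, max_value=100, seed=0):
    """Compare the fast solution to brute force on small random inputs."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(trials):
        length = rng.randint(2, max_len)
        nums = [rng.randint(0, max_value) for _ in range(length)]
        expected = max_pairwise_product_naive(nums)
        actual = max_pairwise_product_fast(nums)
        assert actual == expected, f"mismatch on {nums}: {actual} != {expected}"
    return trials


# Hand-written corner cases first, then the random ones.
assert max_pairwise_product_fast([0, 0]) == 0
assert max_pairwise_product_fast([5, 5]) == 25
stress_test()
```

Keeping the random inputs small matters: when an assert fires, the failing list is short enough to debug by hand.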

Once I saw the time and headaches saved by rigorous testing, I started implementing testing at work. Prior to the course, my code base relied on integration testing, not unit testing. Afterwards, I made it a team project to go back and write unit tests, and the number of blocker issues we uncovered was incredible.

Pseudocode Literacy

I had never read formal pseudocode before - shocking, I know. Pseudocode was intimidating, and I just avoided it. You can't avoid it in this specialization though: the course is language-agnostic, so the lingua franca is pseudocode. Every lecture has pseudocode, so every week involves translating what's conceptually laid out into code. This skill greatly increased my ability to pick up technical documentation on language-agnostic places like Wikipedia.

Immediate Applicability

Algorithms and Data Structures sounded more like conceptual learning than something helpful in my day-to-day work. I was wrong about that. Learning "memoization" (giving your program memory of past results) as part of dynamic programming immediately gave me insight on how to speed up a database call that made redundant calculations. Implementing it took less than two days and cut a query that runs 10 times a day from 5-10 minutes down to 1-2 minutes. Sounds simple, but I had never heard of the concept before. However, because the class is general, not applications-focused, you're going to need to figure out the applications yourself. I still haven't figured out applications for all those graph algorithms or self-balancing trees.
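For anyone who hasn't seen memoization either, here's a minimal sketch of the idea. This is not my actual work code: slow_lookup is a hypothetical stand-in for any expensive, repeatable calculation, and the memoize helper is written by hand just to show the mechanics.

```python
def memoize(func):
    """Wrap func so each distinct set of arguments is computed only once."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)  # first time: do the real work
        return cache[args]  # afterwards: answer from memory

    return wrapper


calls = []  # records every time the slow function really runs


@memoize
def slow_lookup(key):
    """Hypothetical stand-in for an expensive calculation."""
    calls.append(key)
    return key * key


results = [slow_lookup(k) for k in [3, 4, 3, 3, 4]]
print(results)  # [9, 16, 9, 9, 16]
print(calls)    # [3, 4] -> the real work ran only twice
```

Python's standard library ships the same idea as functools.lru_cache, so in practice you rarely need to write the decorator yourself.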

So What's Missing?

Declining Enrollment

The specialization started off with high participation that gradually declined as we advanced through the courses. In the first course, Algorithmic Toolbox, we had several thousand students across the world, according to a live world map showing student populations. The forums were active; every question I had while doing the homeworks had already been asked and answered there.

It was a different story by the third course, Algorithms on Strings. From forum activity, I estimate only several hundred people are taking this course. One question that I posted got fewer than 10 views and no response after several days. Coincidentally, the world map showing classmate numbers is also no longer on the course site. I wish I had taken a screenshot of the original map to prove this! The enrollment decline is to be expected given the increasing difficulty of the course and the extended commitment needed to stay on track. To be honest, I'm one of those students who's fallen behind. After completing the first two courses, I wasn't able to complete Algorithms on Strings on time and am now doing it with the second session. I hope more students regroup with me; active discussion is key to learning.

Python's Limitations

I love Python because it's an abstracted, high-level language. Unfortunately, this makes it difficult to implement many of the data structures, because they don't exist natively in the language. Take the Python list object: it's easy to understand and use to build applications because of its flexibility. The downside is that you need to restrict how you use it, or build your own structure around it, to get something that behaves like a linked list or queue.
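As an illustration of what "building it yourself" looks like, here's a minimal sketch of a queue backed by a hand-rolled linked list. The Node and LinkedQueue names are my own, not from the course starter files.

```python
class Node:
    """One link in the chain: a value plus a pointer to the next node."""

    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedQueue:
    """First-in, first-out queue with O(1) enqueue and dequeue."""

    def __init__(self):
        self.head = None  # next node to leave the queue
        self.tail = None  # most recently added node
        self.size = 0

    def enqueue(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node  # queue was empty
        else:
            self.tail.next = node
        self.tail = node
        self.size += 1

    def dequeue(self):
        if self.head is None:
            raise IndexError("dequeue from empty queue")
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None  # queue is empty again
        self.size -= 1
        return node.value


queue = LinkedQueue()
for item in ["a", "b", "c"]:
    queue.enqueue(item)
print(queue.dequeue())  # a
print(queue.size)       # 2
```

A plain list can fake a queue with append and pop(0), but pop(0) shifts every remaining element, so it gets slow on large inputs. Outside of a class exercise, collections.deque from the standard library is the usual choice.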

This lack of native "low-level" structures in Python also meant fewer resources than I expected. Most supplemental resources I found outside the class were exclusively in C or Java, so I relied heavily on reading pseudocode and Stack Overflow Python questions. A great find was this free Data Structures and Algorithms textbook written in Python. It covered roughly 70% of the first two courses.

After all that's said and done, I do thank Coursera for including Python as a supported language. I would never have taken on this course if it required learning a whole new language.

Hope this was helpful. Now on to finishing Algorithms on Strings!