General Assembly's Data Science Course (github.com/justmarkham)
143 points by Jasamba on April 1, 2016 | 14 comments



Hi all! I'm the owner of the GitHub repo linked above. It's called "DAT8" because it was the 8th session of General Assembly's part-time Data Science course in Washington, DC.

Here's a bit of history, if you're interested:

- General Assembly launched a standardized Data Science course curriculum in mid-2013. The course used lots of slides, and taught both Python and R.

- Over time, the curriculum evolved in different directions as instructors around the world made their own modifications, both in terms of what was taught and how it was taught. By early 2015, I think all instructors had moved to teaching Python only.

- I taught the course five times starting in mid-2014. Over the course of those five sessions, I rewrote nearly all of the lessons and converted most of them to IPython/Jupyter notebooks. I would estimate that 95% of the material in the DAT8 repo is my original lessons.

- Earlier this year (2016), General Assembly rewrote and relaunched a standardized Data Science course curriculum. Therefore, if you take this course in the future, I would expect that you will be learning from that new curriculum.

For what it's worth, I wrote a 4000-word essay [1] about data science education after teaching the course the first time.

I'm happy to answer any questions!

[1] http://www.dataschool.io/teaching-data-science/


Really appreciate your comments here. And your essay (and the larger dataschool.io site), too. GA's data science class is one of several that I'm considering signing up for -- and I'm also considering applying to be an instructor. I think it's really promising that GA has instructors like you that seem to have both the interest in and latitude to make the course as enriching an experience as possible.

If you'd be open to answering some questions about your experience as an instructor, there are a few other things I'm curious about:

- What were your students like? Were they all qualified to be there? Were there challenges dealing with outliers (either overqualified or under)?

- What surprised or stretched you about your experience teaching at GA?

- What background did you bring to it before teaching there?

If you'd rather answer privately, email's in my profile.


I attended the first GA Data Science course to be offered in the DC area in the spring of 2014.

Back then, the course was 3 months long, two nights a week, from 7PM to 10PM on Tuesday and Thursday nights. The first part was all R, and the second part was all Python (scikit-learn, etc.).

I got a lot out of it, but it was a TON of work. Outside of class, I was spending 2 - 3 hours a night doing homework. The teacher was great, and he made extensive use of Git for all assignments. Homework was submitted via commits to a class repo.

Other students who came into the course got less out of it, mainly because they didn't have ANY background in programming, or because they simply didn't put in the work.

Overall, I think it was awesome for me, but it's like anything else: You get out of it what you put into it.

Edit:

I just noticed the repo was created by Kevin Markham! He was in my class, and the dude is friggin brilliant!


JPKab/John! Hilarious to cross paths with you again in the Hacker News comments :)

Your compliment is appreciated, but you are way too generous!

So, after taking the first session of the course in DC (internally called "DAT1"), I was a TA for DAT2, the co-instructor for DAT3, DAT4, and DAT5, and the solo instructor for DAT7 and DAT8. The repo linked above was my repo for DAT8.

The curriculum evolved quite a bit since you took the course. One major change is that by DAT3, we had moved to Python only. We had an increasing number of students without a programming background, which was one of the reasons for the shift to a one-language course.

And I totally agree: Students get out of it what they put into it.


This is exactly my experience too with the GA NYC class. The classes were a great overview of the different tools a Data Scientist has in their toolbox but the real learning came from homework projects.


I work at General Assembly as an Engineering Manager - check out our listing in the Who's Hiring thread for onsite and remote engineering positions, but also for teaching opportunities at our various campuses, and plenty of other roles.


Looks amazing but I doubt I could follow the curriculum without some class videos and feedback on homework/projects.

Unfortunately they don't teach in my area and the online courses are much higher level.


Former GA instructor here.

I kept a LOT of stuff (~300 repos) on GitHub for student assignments when I taught WDI, all public. I liked keeping it public by default, and advocated internally for such for a few reasons; I didn't think there was anything much in the repos that needed to be kept private, and I liked students' contributions via PRs for turning in assignments to be public so that they could show a strong history of doing stuff on GitHub to future employers. I'm also a huge FOSS advocate, and like pushing for things to be public when possible as a default.

Some of the assignments could in theory be externally consumed, but I put exactly zero time into making it more usable for people not actually in the class.


watty: I am the author of the repo, and I hear your concerns. After finishing teaching with General Assembly, I launched my own online course [1] because I wanted to provide a classroom-like experience to online students, and very few companies seemed to be doing that. By "classroom-like", I mean that I teach classes live (via webinar), hold live office hours (via Slack), assign pre-class readings, and assign homeworks (and provide feedback to students).

It will be interesting to see whether other players in the online education space start to provide similar offerings. The most similar structure I have found is companies [2][3] providing a set curriculum (via videos and written materials) along with a mentor you work with one-on-one.

[1] http://www.dataschool.io/learn/

[2] https://www.springboard.com/

[3] https://www.thinkful.com/


Stay tuned for new online offerings in the future that may meet your needs.


This seems pretty cool! I've been looking for something like this. I really enjoy the open source courses on github, because I can do them at my own pace and figure things out.


ybrah: Great! I tried to write the IPython notebooks so that they could be understood without too much further explanation. As well, all of the exercises and homeworks have provided solutions. Feel free to let me know if you have any questions.


Can anyone who attended the course comment on the quality? This looks very promising to me.


I wrote and taught GA's first data science curriculum [0], back when they were a coworking space in NYC. I'm proud to say that out of 20 students, about half got promotions, new jobs, raised significant capital, or had successful acquisitions. Whether the course was a causal factor... it appears to have been correlated.

Students who didn't have time to do much homework quickly lost interest in the course. I wasn't experienced enough at teaching to handle this. I've become a much better instructor since that first time -- at designing curricula, lecturing, managing a diverse classroom, etc. So, I'd say the quality of the class will be highly instructor-dependent.

[0] http://selik.org/2012/07/04/teaching-data-science/



