
4th year EE undergraduate student here, taking both "Data Analysis/Pattern Rec" and "Computer Vision" electives this term. My early courses prepared me more for a path focused on circuit design, but I jumped ship through exposure to wonderful, wonderful DSP. A lot of what I'm learning now is very new to me, so I appreciate comments like yours that give a sense of potential gaps in my learning. Thank you.

I'm currently working on an assignment for CV in which we extract Histogram of Oriented Gradients (HOG) features from the CIFAR-10 dataset using Python, then use them to train one of three classifiers (SVM, Gaussian Naive Bayes, Logistic Regression). I had asked about preprocessing, but was told it was outside the scope of this assignment, so we're just using the dataset as-is. :(
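For anyone unfamiliar with HOG, here is a rough sketch of the core idea, not the assignment's actual code. It builds the gradient-orientation histogram for a single grayscale cell in plain Python. The function names, the 9-bin default, and the hard bin assignment are illustrative assumptions; a library routine such as scikit-image's `hog` additionally handles block normalization and overlapping blocks, and many implementations interpolate votes between neighboring bins.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Unsigned-orientation gradient histogram for one grayscale cell.

    `cell` is a list of equal-length rows of numbers. Gradients use central
    differences on interior pixels only. Each pixel adds its gradient
    magnitude to the bin containing its angle; bins span [0, 180) degrees.
    Simplification (assumption): no interpolation between neighboring bins.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical difference
            magnitude = math.hypot(gx, gy)
            # Fold the signed angle into [0, 180) so opposite directions match.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against a rounding result of exactly 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

def l2_normalize(vec, eps=1e-6):
    """Scale a feature vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec) + eps ** 2)
    return [v / norm for v in vec]

# A 6x6 cell with a vertical edge: intensity changes only along x.
vertical_edge = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
features = l2_normalize(hog_cell_histogram(vertical_edge))
```

For the vertical edge, all of the gradient energy lands in the first bin (angles near 0 degrees). Concatenating such normalized histograms over every cell of an image gives the feature vector that a classifier like an SVM would then be trained on.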

The nice bit is, I have a research internship coming up in a lab that will have me working on actual datasets, rather than toy examples. And there's a data science club on campus with an explicit focus on cleaning data, which I plan to attend regularly. So... hopefully I'm on the right track!




Don't worry: when you have real problems, you will have time to learn. Most of the time goes not even to data cleaning but to debugging: getting into the details of the data, or of code written by somebody else, to understand why something is not working (and there's always something that's not working :) ). The main differentiator is whether you have the interest and patience for that or not.



