
I basically did what you’re talking about. Master’s in physics, then went into semiconductors as an engineer and materials scientist, then switched to data scientist at a bank for 18 months, and have now been a research scientist at AWS for almost two years.

At Amazon, it’s easy to move around, but not between job families. I think it’s a bad idea to join as an SWE and try to transfer, because they want people who have done the target role before, and you’re unlikely to do that sort of work as an SWE. I think it’s better to get experience in the role you want at a less prestigious company. You’ll learn a ton. Pick the best company that will have you.

My personal turning point was when I did free work for a local startup in exchange for them letting me take the Data Scientist title. To a recruiter, it’s totally obvious to hire a data scientist for a data scientist role, and isn’t clear at all what physics has to do with it. Recruiters are the first step when you’re starting fresh, so make it easy for them.

I also somewhat disagree with many of the comments here that textbooks are better than tutorials. If you buy these 1000-page graduate-level texts with the idea that you need to read them cover to cover, you’re likely going to give up and fail. Instead, buy the books and put them on the shelf, and then work through tutorials and examples. Then reference the particular sections of the book that are relevant to your work to add depth.

Finally, I recommend against starting with deep learning. There’s a whole helluva lot to learn with basic techniques. Very few companies are actually using deep learning in production systems. Start with linear and tree-based methods to learn all the stuff about how to frame the problem and build robust systems. Then you’ll have a deeper appreciation for DL.

A reasonable person could disagree and say that there’s so much domain-specific stuff around the art of DL that it really behooves you to start there ASAP. I would counter that you’re unlikely to be considered for positions using DL unless you’re pursuing your PhD in it or have proven yourself in industry. Since that isn’t your situation, I’d wait until you get your foot in the door somewhere and then pursue DL on the side. That’s what I did, and then you look like a hero to your boss. This strategy led to my first publication in the field and I’m now working on DL almost exclusively.

Edit: one more thing. Think carefully about the type of work you want to do. My advice is assuming you’d like to be a person that trains/deploys ML models to solve problems in industry. This is much different than an ML Engineer, who’s implementing algorithms in low level languages and squeezing out efficiency. Obviously that would require a much deeper understanding of SWE. And a totally different person is an academic researcher that’s developing theory or technique. It’ll be hard to do that without a PhD.




Great reply. My question is based on this comment:

>My advice is assuming you’d like to be a person that trains/deploys ML models to solve problems in industry. This is much different than an ML Engineer, who’s implementing algorithms in low level languages and squeezing out efficiency. Obviously that would require a much deeper understanding of SWE. And a totally different person is an academic researcher that’s developing theory or technique. It’ll be hard to do that without a PhD

Can one not only train/deploy ML models, but in addition to that be able to implement the algorithms in low level languages and also be able to develop theory?

I’d imagine these are all skill sets that someone in a PhD program could pick up.

If they could do all three, what kind of job should they be looking for?


I think it’s unlikely that anyone becomes an expert in all of those things. If you do, it’s over the course of an entire career, not when you’re getting started. I guess it comes down to how much expertise is “enough” for you. Naturally, if you split your time across 3 domains you won’t be as expert as someone who dedicated all their time to going deep in one.

In the context of a big company, I think it makes sense to have a specialized workforce. Why look for the one-in-a-billion person who can publish top-quality theoretical papers and then implement them on distributed GPUs in an optimal way while also building simple Random Forest models for your business? I’d rather that person do more of the most valuable thing, and then hire someone else to do the rest.


This answer makes sense.

I suppose my question is more along the lines of, if someone is specializing in deep learning in a PhD program then shouldn’t they at the very least be able to implement models and also know optimization tricks?

In other words, shouldn’t they be able to develop enough skills to go deep in one area but also know enough to be dangerous in the other two?


I think I agree with you with the caveat that it would depend on what they're researching. If they're researching new model architectures, I don't think it makes sense for them to try to implement the algorithms from scratch in C++/CUDA to do distributed GPU training--why not just use TensorFlow? But if you're researching distributed tensor computation, then that's your bread and butter.


Great reply. Just out of curiosity, did you end up giving up your semiconductor job before joining the startup with a data scientist title?

I am deep into semiconductors, and am facing the dilemma of giving up my expertise so far to join a startup as an entry-level engineer.

I have done a couple of MOOC specializations and am trying to find projects within my industry to gain some credibility. Also trying to stay active on Kaggle to build some basic data analysis portfolio.


In my particular case, I did quit my job before I got my "real" Data Science position. I can't recall exact timelines, but I think I had already lined up the relationship with the startup.

The reason I did this is that there just weren't enough hours in the day, and my job was taking ~10 hours a day with commuting, etc. It was a risk, but the idea was that I would be able to transition much more quickly if I worked full-time towards it. I also had the financial savings to support myself for 6-9 months and was willing to get a part-time job if necessary. Once it became clear that my job's only purpose was to pay the bills in the context of my goals, and I had enough to pay the bills for the near future, it was clear that quitting was the easiest way to free up a lot of time.

This turned out to be the best decision of my career, but YMMV. I doubled my salary in less than 2 years. It's also nice to be part of an industry that isn't so cost-sensitive. I also have a skill set that's in much higher demand, so you can live almost anywhere and there's a ton of companies that want/need it. With semiconductors, you're much more limited.

It's true that you're giving up some expertise and will start in a less senior position in a different field than if you stayed in semi. But sticking around because you have experience is the classic sunk cost fallacy. Think 5 years down the line. If you leave now, you'll have 5 years of experience in ML. You'll definitely be giving something up if you leave, but there's a huge opportunity cost if you don't leave.


Would you be open to have a brief conversation on the phone for advice? My email is knariks@gmail.com.


Hi, OP here. I don't think you need to give up your experience to join a startup as a data scientist. What won't be 100% transferable is your semiconductor-specific skills (think SPICE, process technology, etc.). If you have been coding in your current job, that is a skill that is transferable. Your PhD is a massive foot in the door. Have you considered something like Data Incubator or the Insight Data Science fellowship, which require at least a PhD to transition?



