What's the difference between the Data Analyst and Machine Learning Engineer Nanodegrees at...

Melvin Dunn


The answer you want:

Data Analysts usually do not build models. They explore data and generally use descriptive statistics. Sometimes they’ll use more advanced techniques like correlations and other multivariate tests. Machine Learning Engineers build statistical models, from my understanding. They may even program models into applications, which makes sense given the title. Data Scientists do both and much more. If you really want to do the Udacity program, do the Machine Learning Engineer Nanodegree.
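To make that distinction concrete, here is a rough sketch in Python, assuming pandas and scikit-learn are installed. The hours/score numbers are invented purely for illustration, and a plain linear regression stands in for "a model"; neither comes from the Nanodegree material.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data (made up for illustration): hours studied vs. exam score.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "score": [52, 55, 61, 64, 70, 73, 79, 82],
})

# Analyst-style work: describe the data and check how the columns relate.
summary = df.describe()
correlation = df["hours"].corr(df["score"])

# ML-engineer-style work: fit a model that predicts the score for unseen input.
model = LinearRegression().fit(df[["hours"]], df["score"])
predicted = model.predict(pd.DataFrame({"hours": [9]}))[0]

print(summary)
print(f"correlation between hours and score: {correlation:.3f}")
print(f"predicted score for 9 hours: {predicted:.1f}")
```

The first half only summarizes what is already in the data; the second half produces something you could ship inside an application.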

The answer you need:

If you’re simply interested in building predictive models, I would take Andrew Ng’s course. I have no idea what your background is, but the course is very intuitive:

Machine Learning - Stanford University | Coursera

Also, I’ve spent a considerable portion of my natural life on Udacity, since I’m in Georgia Tech’s OMSCS program. Most of the Machine Learning-related courses don’t have any step-by-step walkthroughs for building models (maybe Deep Learning does?). They explain the theory behind them. The same goes for the Stanford course. Even though I also teach Data Science for a company like Udacity, I’d recommend not throwing your money at institutions in order to achieve the American dream. Don’t get me wrong, I love Udacity and support them. I just don’t want to steer you or any other person reading this the wrong way.

If you wish to build models and become a real Data Scientist:

  1. Take Andrew Ng’s course
  2. If you program:
    1. Download Python. You can also download R. People in Data Science usually know both.
    2. R has many packages scattered about, many of them made by Hadley Wickham. In Python, you should probably install the scikit-learn, scipy, numpy, and pandas packages. Install pip if you haven’t already.
      1. Sometimes you’ll need to learn other languages to implement your models in applications. You should probably be ready for this at some point in your life if you choose to work in this field. Or you could hit up yhat - End-to-End Data Science Platform. (Not a shameless plug.)
  3. If you don’t program:
    1. Download WEKA. It has the base tools discussed in the course I mentioned earlier, which you should have watched. Google WEKA. It’s the GUI tool with the cute bird from the University of Waikato in New Zealand; you can’t miss it.
    2. When you’re done playing with WEKA, you’ll eventually need to learn how to program.
  4. Play around with datasets at the UCI Machine Learning Repository. Try a bunch of the models you learned in the course you should have watched.
    1. Become overconfident, thinking that Machine Learning is easy because this is toy data.
  5. Go forward with your overconfidence. Go to Kaggle, Your Home for Data Science, and start competing.
    1. You wonder why you haven’t gotten in the top 10% yet with your model.
    2. Time goes by, and you start to research how to win.
    3. You realize that there are people with souls in the industry who are willing to help others and give back to the community, and you are pleasantly surprised. You read their blogs: Kaggle Ensembling Guide. You also watch the forums like an eagle or a hawk.
    4. You use their examples and continue learning. After trial and error, you begin to become somewhat of a professional in the field.
    5. You try to hack other people’s scripts to get a higher leaderboard score, but soon realize that you are subhuman and should actually learn instead of being a hack.
    6. Time goes by. You either give up or continue your journey. You live the life of the video game sprite after the “Continue?” button in arcade games.
      1. It’s at this point you’ve probably decided you want to be in this field, because it’s wonderful and interesting, not just because it pays really well.
  6. You decide that you want a job. Luckily, your Kaggle endeavor is actually considered experience by employers. (Keep in mind that there are other ML competitions too, such as Numerai.)
  7. Depending on the company, you’ll need an advanced degree. Sometimes this is necessary; sometimes people think that if the title has “Scientist” in it, you at least need a graduate degree. It’s a fairly new position title, so go figure.
    1. At the very least, you may need to (1) be working on an advanced degree in Statistics, Math, or Computer Science, or (2) have a Bachelor’s in one of them.
      1. Do other things besides Kaggle if you can get your foot in the door. Teach, volunteer, do whatever you can.
      2. If you actually want a job, spam your resume fearlessly. Half the time, people don’t even read cover letters, and the algorithms for scanning “good” candidates are, for lack of a better term, terrible. Be sure to put plenty of obnoxious keywords in your resume so employers call you back. Feel free to still apply, even though you don’t have an advanced degree. You may still be a strong candidate.
  8. Remember that there are other skills a Data Scientist should have. You should probably know something about Natural Language Processing. The NLTK and gensim packages in Python are good resources. There are many competitions that involve types of data besides numeric observational data sets. You’ll need to get familiar with images at some point as well. Learn about Deep Learning and Computer Vision. Reinforcement Learning has become popular because Google made it sexy, but more importantly, it covers the decision-making portion of AI. Although it is mostly tested in game environments, perhaps you can find ways to use it in business applications?
    1. Never stop learning. New techniques and algorithms come out every day.
  9. Lastly, don’t ignore exploratory data analysis because you think it’s beneath you. It helps with feature engineering immensely.
  10. If you have no idea what I’m saying throughout this, you should Google it.
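If you program, steps 4 and 9 (plus a tiny taste of the ensembling mentioned in step 5) can be sketched in a few lines of Python. Treat this as a rough starting point rather than a recipe: it assumes scikit-learn and pandas are installed (for example via `pip install scikit-learn pandas`), it uses the Iris data that ships with scikit-learn (a classic toy dataset that is also hosted at UCI), and the three models and the soft-voting ensemble are arbitrary picks for illustration, not anything the course or Udacity prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Iris is bundled with scikit-learn, so nothing needs to be downloaded.
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Step 9: look at the data before modeling (summary stats and correlations).
print(X.describe())
print(X.corr().round(2))

# Step 4: try a bunch of models (arbitrary picks for illustration).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Step 5 flavor: a simple soft-voting ensemble of the three models above.
models["voting_ensemble"] = VotingClassifier(
    estimators=list(models.items()), voting="soft"
)

# Compare everything with 5-fold cross-validated accuracy.
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy {scores[name]:.3f}")
```

Expect every model to score very well here. That is step 4.1 in action: this is toy data, and Kaggle will be far less forgiving.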

