Preparation Guide for DP-100: Designing and Implementing a Data Science Solution on Azure

Suraj Bhala
Jun 18, 2020
** image taken from microsoft.com and my own badge

About a year ago, Microsoft released a certification focused on the Azure tools and components that data scientists can use to take full advantage of the cloud to build, train and deploy machine learning models using different data sources.

This exam is not easy to crack and requires well-planned preparation, as you need to know both the cloud (Azure) and various data science algorithms and statistical metrics, along with the basics of deep learning algorithms.

Objective Domain :

  1. Set up an Azure Machine Learning Workspace (30–35%)
  2. Run Experiments and Train Models (25–30%)
  3. Optimize and Manage Models (20–25%)
  4. Deploy and Consume Models (20–25%)

Check this document for detailed skills measured: Exam DP-100: Designing and Implementing a Data Science Solution on Azure — Skills Measured

Types and Number of Questions

I took the exam in June 2020 and had to answer 55 questions in 180 minutes. These were the types of questions asked:

  1. Scenario-based questions having single choice
  2. Multiple choice questions (multiple correct answers)
  3. Single correct answer questions
  4. Code completion as filling in the blanks
  5. Arranging in correct sequence

The questions span a variety of topics like (not definitive or exhaustive):

Important Tips

  1. Make sure to practice all the code from this GitHub repo: https://aka.ms/mslearn-aml-labs
  2. Get some hands-on practice with Automated Machine Learning and with creating ML pipelines on Azure
  3. Make sure you understand which technique or tool will be used in what situation and how to use them.
  4. If you are new to data science, or want to revise data science concepts along with the Azure concepts, this PluralSight course is an official study guide for DP-100: https://www.pluralsight.com/paths/microsoft-azure-data-scientist-dp-100
  5. If you are using Azure for the first time, sign up for the free account; Microsoft gives you some credits to explore basic Azure capabilities and the various services offered.
  6. Try to take the exam before the 20th of the month, as Microsoft usually updates the syllabus around the 20th to the 25th.
  7. If you can, take the exam at a Pearson centre, as you will have pen and paper in case you need to write anything down.
  8. If you are taking it from home or the office, make sure you have a stable internet connection, and preferably take it somewhere you will not be disturbed for 3.5 hours.
  9. The passing score was 70% (700/1000) when I took the exam in June 2020.

Official study guide :

Visit the DP-100 docs by Microsoft: https://docs.microsoft.com/en-us/learn/certifications/exams/dp-100?wt.mc_id=learningredirect_certs-web-wwl

These are the two Learning Paths which are official study guides :

  1. Create no-code predictive models with Azure Machine Learning (added recently)
  2. Build AI solutions with Azure Machine Learning service

Azure AI Gallery

It is meant for data scientists and Azure developers to share their analytics solutions, and it gives a good understanding of how to use Azure ML Studio and build pipelines. You can start by going through the Microsoft experiments, followed by experiments submitted by others on topics like Regression, Classification, Clustering, Recommendation Systems, Time Series Modelling, NLP applications like text analytics, Data Transformation, Forecasting, Word Cloud and many more.

Click on the Experiments (highlighted in Image) and check the experiments in Azure AI Gallery.

Important Resources :

  1. https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters (very imp)
  2. https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview (very imp)
  3. https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/tools-included (very imp)
  4. Azure Kubernetes service: https://azure.microsoft.com/en-in/services/kubernetes-service/ (very imp)
  5. MLOps: https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment (imp)
  6. https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment (imp)
  7. Azure Notebooks: https://notebooks.azure.com/ (imp)
  8. https://docs.microsoft.com/en-us/azure/azure-databricks/what-is-azure-databricks (imp)
  9. PyTorch model training: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-pytorch (very imp)
  10. Scikit-learn model training: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn (very imp)
  11. TensorFlow model building: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-tensorflow (very imp)
  12. Keras model training: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-keras (very imp)
  13. Reinforcement Learning: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-reinforcement-learning (in preview)
  14. https://docs.microsoft.com/en-us/azure/machine-learning/compare-azure-ml-to-studio-classic (imp)
  15. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote (data balancing technique)
  16. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/split-data
  17. How to partition and sample the dataset: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/partition-and-sample (very imp)
  18. Metadata: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/edit-metadata (imp)
  19. Model Evaluation: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model (very imp)
  20. https://docs.microsoft.com/en-us/azure/machine-learning/concept-onnx
  21. Sentiment Analysis: https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/how-tos/text-analytics-how-to-sentiment-analysis?tabs=version-3
  22. https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment
  23. Accessing data: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-access-data (very imp)
  24. Automated ML: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-automated-ml-for-ml-models; also read the other links on AutoML (this is a must)
  25. Two-Class Neural Network: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-neural-network (imp)
  26. Filter-based feature selection: https://docs.microsoft.com/bs-latn-ba/azure/machine-learning/studio-module-reference/filter-based-feature-selection (imp)
  27. Hypothesis testing: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/test-hypothesis-using-t-test
  28. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/permutation-feature-importance
  29. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-support-vector-machine
  30. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-decision-jungle
  31. https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where (very imp)
  32. https://docs.docker.com/toolbox/toolbox_install_windows/ (imp)
  33. https://docs.microsoft.com/en-us/azure/hdinsight/ (must know the basics and uses)
  34. ML on HDInsight: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-machine-learning-overview
  35. Apache Spark on HDInsight: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview
  36. https://docs.microsoft.com/en-us/cognitive-toolkit/brainscript-basic-concepts
  37. https://pandas.pydata.org/docs/reference/api/pandas.melt.html (imp)
  38. https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/data-format-conversions (imp)
  39. https://docs.microsoft.com/en-us/azure/azure-functions/functions-overview
  40. Power BI : https://docs.microsoft.com/en-us/power-bi/connect-data/service-azure-and-power-bi (good to know the basics)
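
Several of the links above (the Evaluate Model reference in particular) revolve around classification metrics, so here is a minimal sketch in plain Python of how accuracy, precision, recall and F1 follow from the counts in a confusion matrix. The function name `evaluate_binary` and the sample labels are made up for illustration; this is not how the Azure ML module is implemented, just the standard textbook formulas, assuming binary labels encoded as 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_binary(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts, with 1 treated as the positive class.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard every division so empty or one-sided inputs return 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Made-up labels: 3 true positives, 3 true negatives,
# 1 false positive and 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate_binary(y_true, y_pred)
print(metrics)
```

With these sample labels all four metrics come out to 0.75. Changing a few predictions by hand is a quick way to see how precision and recall move in different directions, which is the kind of reasoning the scenario-based questions tend to ask for.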

If you have further questions, please reach out to me on LinkedIn or ask in the comments below, and if the format or the type of questions change please do let me know.

My Learning

  1. Using Azure to provision machines for ML workflows and modelling
  2. Using Automated ML in Azure
  3. Using the Azure ML SDK for Python to build and run workflows with the ML service
  4. End-to-end ML lifecycle, from data collection and EDA through modelling to deployment, with the help of Azure

Also if you find this post helpful please share and clap.

All the best for your exam!
