Entropy is a measurable physical property that is most often associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in a wide range of fields, from classical thermodynamics, where it was first recognized, to the microscopic description of nature in statistical physics, and to the principles of information theory. It has applications in chemistry and physics, in biological systems and their relation to life, and in cosmology, economics, sociology, weather science, climate change, and information systems, including telecommunications.

**1. You can define Jaccard as the size of the intersection divided by the size of the union of two label sets.**

- **True**
- False
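
The definition in question 1 can be written out directly. This is a minimal sketch using Python sets; the function name `jaccard_index` and the sample labels are illustrative assumptions, not part of the quiz.

```python
def jaccard_index(labels_a, labels_b):
    """Size of the intersection divided by the size of the union of two label sets."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Assumed convention: two empty label sets count as identical.
        return 1.0
    return len(a & b) / len(union)


actual = {"cat", "dog", "bird"}       # hypothetical true labels
predicted = {"cat", "dog", "fish"}    # hypothetical predicted labels
print(jaccard_index(actual, predicted))  # 2 shared labels / 4 distinct labels = 0.5
```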

**2. When building a decision tree, we want to split the nodes in a way that increases entropy and decreases information gain.**

- True
- **False**
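
To see why a good split lowers entropy and raises information gain, here is a small sketch. It assumes Shannon entropy in bits and a size-weighted average over the child nodes; the helper names and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes"] * 5 + ["no"] * 5           # perfectly mixed node, entropy 1.0
pure_split = [["yes"] * 5, ["no"] * 5]      # each child holds a single class
poor_split = [["yes", "yes", "no", "no", "yes"],
              ["no", "no", "yes", "yes", "no"]]

print(information_gain(parent, pure_split))  # 1.0
print(information_gain(parent, poor_split))  # about 0.03
```

The split that produces purer children has the lower child entropy and therefore the higher information gain.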

**3. Which of the following statements are true? (Select all that apply.)**

- **K needs to be initialized in K-Nearest Neighbor.**
- **Supervised learning works on labelled data.**
- A high value of K in KNN creates a model that is over-fit.
- KNN takes a bunch of unlabelled points and uses them to predict unknown points.
- **Unsupervised learning works on unlabelled data.**

**4. To calculate a model’s accuracy using the test set, you pass the test set to your model to predict the class labels, and then compare the predicted values with actual values.**

- **True**
- False
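
The comparison described in question 4 amounts to counting matches. A minimal sketch, assuming the predictions have already been produced by some model; `y_test` and `y_hat` are made-up values.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the actual labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


y_test = [1, 0, 1, 1, 0]   # actual labels of the test set (hypothetical)
y_hat = [1, 0, 0, 1, 0]    # labels predicted by some model (hypothetical)
print(accuracy(y_test, y_hat))  # 4 of 5 correct = 0.8
```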

**5. Which is the definition of entropy?**

- The purity of each node in a decision tree.
- Information collected that can increase the level of certainty in a particular prediction.
- The information that is used to randomly select a subset of data.
- **The amount of information disorder in the data.**

**6. Which of the following is true about hierarchical linkages?**

- **Average linkage is the average distance of each point in one cluster to every point in another cluster**
- Complete linkage is the shortest distance between a point in two clusters
- Centroid linkage is the distance between two randomly generated centroids in two clusters
- Single linkage is the distance between any points in two clusters
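
For comparison, the usual definitions of the four linkage criteria can be sketched as follows. This assumes Euclidean distance and clusters given as lists of coordinate tuples; the function names and sample points are illustrative.

```python
from itertools import product
from math import dist


def pairwise(cluster_a, cluster_b):
    """All point-to-point Euclidean distances between two clusters."""
    return [dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_linkage(a, b):
    return min(pairwise(a, b))            # shortest distance between the clusters


def complete_linkage(a, b):
    return max(pairwise(a, b))            # longest distance between the clusters


def average_linkage(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)                # mean over every cross-cluster pair


def centroid_linkage(a, b):
    centroid = lambda c: tuple(sum(xs) / len(c) for xs in zip(*c))
    return dist(centroid(a), centroid(b))  # distance between the cluster means


a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]
print(single_linkage(a, b))    # 3.0
print(complete_linkage(a, b))  # sqrt(17), about 4.12
print(average_linkage(a, b))   # about 3.57
```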

**7. The goal of regression is to build a model to accurately predict the continuous value of a dependent variable for an unknown case.**

- **True**
- False

**8. Which of the following statements are true about linear regression? (Select all that apply)**

- **With linear regression, you can fit a line through the data.**
- **y = a + b·x1 is the equation for a straight line, which can be used to predict the continuous value y.**
- In y=θ^T.X, θ is the feature set and X is the “weight vector” or “confidences of the equation”, with both of these terms used interchangeably.
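
Fitting a straight line with an intercept `a` and a slope `b` can be sketched with ordinary least squares. This is one common way to estimate the two coefficients, not necessarily the method the course uses; the data points are made up so that the fit is exact.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates of intercept a and slope b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    return a, b


xs = [1, 2, 3, 4]                 # hypothetical independent variable
ys = [3, 5, 7, 9]                 # hypothetical dependent variable
a, b = fit_line(xs, ys)
print(a, b)                       # 1.0 2.0
print(a + b * 5)                  # predicted continuous value for x = 5: 11.0
```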

**9. The sigmoid function is the main part of logistic regression, where the sigmoid of θ^T.X gives us the probability of a point belonging to a class, instead of the value of y directly.**

- **True**
- False
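
A short sketch of how the sigmoid turns θ^T.X into a probability. The weights `theta` and the feature vector `x` are hypothetical, and the first feature is assumed to be a constant 1.0 so that the first weight acts as the intercept.

```python
from math import exp


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def predict_proba(theta, x):
    """Probability of the positive class: sigmoid of the dot product theta^T x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


theta = [-1.0, 2.0]   # hypothetical weights; the first entry is the intercept
x = [1.0, 0.5]        # the leading 1.0 pairs with the intercept
print(predict_proba(theta, x))  # z = -1 + 1 = 0, so the probability is 0.5
```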

**10. In comparison to supervised learning, unsupervised learning has:**

- **Fewer tests (evaluation approaches)**
- More models
- A better controlled environment
- More tests (evaluation approaches), but fewer models

**11. The points that are classified by Density-Based Clustering and do not belong to any cluster are outliers.**

- **True**
- False

**12. Which of the following is false about Simple Linear Regression?**

- It does not require tuning parameters
- It is highly interpretable
- It is fast
- **It is used for finding outliers**

**13. Which one of the following statements is the most accurate?**

- **Machine Learning is the branch of AI that covers the statistical and learning part of artificial intelligence.**
- Deep Learning is a branch of Artificial Intelligence where computers learn by being explicitly programmed.
- Artificial Intelligence is a branch of Machine Learning that covers the statistical part of Deep Learning.
- Artificial Intelligence is the branch of Deep Learning that allows us to create models.

**14. Which of the following are types of supervised learning?**

- **Classification**
- **Regression**
- **KNN**
- K-Means
- Clustering

**15. A Bottom-Up version of hierarchical clustering is known as Divisive clustering. It is a more popular method than the Agglomerative method.**

- True
- **False**

**16. Select all the true statements related to Hierarchical clustering and K-Means.**

- **Hierarchical clustering does not require the number of clusters to be specified.**
- Hierarchical clustering always generates different clusters, whereas k-Means returns the same clusters each time it is run.
- **K-Means is more efficient than Hierarchical clustering for large datasets.**

**17. What is a content-based recommendation system?**

- **Content-based recommendation system tries to recommend items to the users based on their profile built upon their preferences and taste.**
- Content-based recommendation system tries to recommend items based on similarity among items.
- Content-based recommendation system tries to recommend items based on the similarity of users when buying, watching, or enjoying something.

**18. Before running Agglomerative clustering, you need to compute a distance/proximity matrix, which is an n by n table of all distances between each data point in each cluster of your dataset.**

- **True**
- False
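
The n by n distance matrix mentioned in question 18 can be sketched like this, assuming Euclidean distance and treating each data point as its own starting cluster. The function name and the sample points are illustrative.

```python
from math import dist


def distance_matrix(points):
    """Symmetric n x n table of Euclidean distances between all pairs of points."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The matrix is symmetric, so each pair is computed only once.
            matrix[i][j] = matrix[j][i] = dist(points[i], points[j])
    return matrix


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # hypothetical data points
for row in distance_matrix(points):
    print(row)
```

Agglomerative clustering then repeatedly merges the two closest clusters and updates this table after each merge.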

**19. Which of the following statements are true about DBSCAN? (Select all that apply)**

- **DBSCAN can be used when examining spatial data.**
- **DBSCAN can be applied to tasks with arbitrary shaped clusters, or clusters within clusters.**
- DBSCAN is a hierarchical algorithm that finds core and border points.
- **DBSCAN can find any arbitrary shaped cluster without getting affected by noise.**

**20. In recommender systems, “cold start” happens when you have a large dataset of users who have rated only a limited number of items.**

- True
- **False**

**21. Machine Learning uses algorithms that can learn from data without relying on explicitly programmed methods.**

- **True**
- False

**22. Which are the two types of Supervised learning techniques?**

- Classification and Clustering
- Classification and K-Means
- Regression and Clustering
- Regression and Partitioning
- **Classification and Regression**

**23. Which of the following statements best describes the Python scikit-learn library?**

- A library for scientific and high-performance computation.
- **A collection of algorithms and tools for machine learning.**
- A popular plotting package that provides 2D plotting as well as 3D plotting.
- A library that provides high-performance, easy to use data structures.
- A collection of numerical algorithms and domain-specific toolboxes.

**24. Training and testing on the same dataset might give a high training accuracy, but the out-of-sample accuracy can be low.**

- **True**
- False

**25. Which of the following matrices can be used to show the results of model accuracy evaluation or the model’s ability to correctly predict or separate the classes?**

- **Confusion matrix**
- Evaluation matrix
- Accuracy matrix
- Error matrix
- Identity matrix
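
A confusion matrix simply counts each (actual, predicted) pair. A minimal sketch, assuming rows are actual classes and columns are predicted classes; the labels are made up.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes, in the order of `labels`."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels] for actual in labels]


y_true = [1, 1, 1, 0, 0, 0]   # hypothetical actual labels
y_pred = [1, 1, 0, 0, 0, 1]   # hypothetical predicted labels
print(confusion_matrix(y_true, y_pred, labels=[1, 0]))  # [[2, 1], [1, 2]]
```

The diagonal holds the correctly classified cases, and the off-diagonal cells show which classes the model confuses with each other.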

**26. When should we use Multiple Linear Regression?**

- **When we would like to identify the strength of the effect that the independent variables have on a dependent variable.**
- When there are multiple dependent variables.

**27. In K-Nearest Neighbors, which of the following is true:**

- **A very high value of K (ex. K = 100) produces an overly generalized model, while a very low value of K (ex. K = 1) produces a highly complex model.**
- A very high value of K (ex. K = 100) produces a model that is better than a very low value of K (ex. K = 1)
- A very high value of k (ex. k = 100) produces a highly complex model, while a very low value of K (ex. K = 1) produces an overly generalized model.
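
The effect of K can be seen in a tiny hand-rolled KNN classifier. This is a sketch assuming Euclidean distance and a simple majority vote; the training points and labels are invented for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k):
    """Majority label among the k labelled training points closest to `query`."""
    neighbors = sorted(zip(train_points, train_labels),
                       key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


train_points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
train_labels = ["a", "a", "a", "b", "b", "b", "b"]
query = (0.5, 0.5)   # sits in the middle of the "a" points

print(knn_predict(train_points, train_labels, query, k=1))  # a
print(knn_predict(train_points, train_labels, query, k=3))  # a
print(knn_predict(train_points, train_labels, query, k=7))  # b
```

With K equal to the size of the training set, every query receives the overall majority class, which is the overly generalized extreme. With K = 1, each prediction follows a single neighbour, so the model tracks every local quirk of the training data.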

**28. A classifier with lower log loss has better accuracy.**

- **True**
- False
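
Log loss measures how far the predicted probabilities are from the actual labels. The sketch below assumes binary labels (0 or 1) and clips probabilities with a small `eps` to avoid taking the logarithm of zero; the probability values are hypothetical.

```python
from math import log


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels given predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # keep p strictly inside (0, 1)
        total += y * log(p) + (1 - y) * log(1 - p)
    return -total / len(y_true)


y_true = [1, 0, 1]
confident = [0.9, 0.1, 0.8]    # probabilities close to the actual labels
hesitant = [0.6, 0.4, 0.55]    # same predicted classes, but less certain

print(log_loss(y_true, confident))  # lower value
print(log_loss(y_true, hesitant))   # higher value
```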

**29. When building a decision tree, we want to split the nodes in a way that decreases entropy and increases information gain.**

- **True**
- False

**30. Which one is NOT TRUE about k-means clustering?**

- k-means divides the data into non-overlapping clusters without any cluster-internal structure.
- The objective of k-means is to form clusters in such a way that similar samples go into a cluster, and dissimilar samples fall into different clusters.
- **As k-means is an iterative algorithm, it guarantees that it will always converge to the global optimum.**

**31. Customer Segmentation is a supervised way of clustering data, based on the similarity of customers to each other.**

- True
- **False**

**32. How is a center point (centroid) picked for each cluster in k-means?**

- **We can randomly choose some observations out of the data set and use these observations as the initial means.**
- We can select the centroid through correlation analysis.
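
Random initialization followed by the usual assign-and-update loop can be sketched as below. This is a bare-bones illustration assuming Euclidean distance; the helper names, the `seed` parameter, and the sample points are assumptions made for the example.

```python
import random
from math import dist


def init_centroids(points, k, seed=None):
    """Pick k distinct observations from the data set as the initial means."""
    return random.Random(seed).sample(points, k)


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def update(points, assignment, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    new = []
    for c, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == c]
        new.append(tuple(sum(xs) / len(members) for xs in zip(*members))
                   if members else old)
    return new


def k_means(points, k, seed=None, max_iter=100):
    centroids = init_centroids(points, k, seed)
    for _ in range(max_iter):
        assignment = assign(points, centroids)
        new_centroids = update(points, assignment, centroids)
        if new_centroids == centroids:   # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, assignment


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = k_means(points, k=2, seed=0)
print(centroids, assignment)
```

Because the starting means are random, different seeds can lead to different final clusters, which is why k-means is only guaranteed to reach a local optimum.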

**33. Collaborative filtering is based on relationships between products and people’s rating patterns.**

- **True**
- False

**34. Which one is TRUE about Content-based recommendation systems?**

- **Content-based recommendation system tries to recommend items to the users based on their profile.**
- In content-based approach, the recommendation process is based on similarity of users.
- In content-based recommender systems, similarity of users should be measured based on the similarity of the actions of users.

**35. Which one is correct about user-based and item-based collaborative filtering?**

- In the item-based approach, the recommendation is based on the profile of a user that shows the interest of the user in a specific item.
- **In the user-based approach, the recommendation is based on users of the same neighborhood, with whom the user shares common preferences.**

Recommender systems use a technique called collaborative filtering (CF). Collaborative filtering has two senses: a narrow one and a more general one. In the newer, narrower sense, collaborative filtering is a method of making automated predictions (filtering) about a user’s interests by gathering preferences or taste information from many users (collaborating). The collaborative filtering approach is based on the premise that if person A has the same opinion as person B on one issue, A is more likely to share B’s opinion on a different issue than that of a randomly chosen individual.
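
The premise above can be illustrated with a small user-based sketch. It assumes cosine similarity over the items two users have both rated and a similarity-weighted average for the prediction; the rating data, user names, and function names are all hypothetical.

```python
from math import sqrt

ratings = {   # hypothetical ratings on a 1-5 scale
    "A": {"issue1": 5, "issue2": 1, "issue3": 4},
    "B": {"issue1": 5, "issue2": 1, "issue3": 4, "issue4": 5},
    "C": {"issue1": 1, "issue2": 5, "issue4": 1},
}


def cosine_similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None


print(cosine_similarity(ratings["A"], ratings["B"]))  # close to 1: A and B agree
print(cosine_similarity(ratings["A"], ratings["C"]))  # much lower: A and C disagree
print(predict(ratings, "A", "issue4"))                # pulled toward B's rating of 5
```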