Part I – Short Answer (Show Points/Results) – 5 points each, 40 points total

1. Given the following feature vector x = [4.4, 5.1, −3.7, 2.1, −1.9], what would a categorical representation of this feature vector be if we assumed 3 discrete categories with values x ≤ −2.5 as A, −2.5 < x < 2.5 as B, and x ≥ 2.5 as C?

2. Given a binary classification problem with classes {C1, C2}, draw a Confusion Matrix showing result counts (f11, f10, f1+, f0+, ...) in terms of Predicted and Actual class. Provide calculations for Accuracy and Error Rate, highlighting False Positives and False Negatives (FP, FN) as functions of these result counts.

3. For frequent itemsets {{A, B}, {C}}, show the difference between the Confidence c and the Interest Factor (Lift) for the Association Rule {A, B} ⟹ {C}. What value does Lift take into account that Confidence does not?

4. Given a dataset with n observations, what is the size of the training set if we choose to hold out k records as a test set? If we allow for k → n, what does the corresponding training set size approach?

5. With a data set containing d = 15 features and N = 12,000 observations, what is the dimensionality of the covariance matrix of the predictors? If we were to represent the predictors with a multivariate normal (Gaussian) distribution, how many distribution parameters would need to be estimated from the feature data?

6. Given the following point observations: x1 = [3, 4] and x2 = [5, 12], what would the length of each vector be in terms of the Manhattan and Euclidean norms (L1, L2)? Would the distance between the two points be larger under the L1 or the L2 norm?

7. Draw the 2-way contingency table for a binary association rule {A} ⟹ {B}, containing presence/absence counts (f11, f10, f1+, f0+, ...). Interest Factor (Lift) can be interpreted probabilistically as P(A, B) / (P(A)P(B)); show this probability in terms of these counts.

8. For a binary association rule {a} ⟹ {b}, show that the ϕ coefficient for the rule's correlation measure is not invariant under null addition (i.e., it changes when unrelated data is added) in terms of changes to the relevant counts (f11, f10, f1+, f0+, ...).

Part II – Long Answer (Show Reasoning/Calculations) – 10 points each, 40 points total

1. Show the cosine similarity of the two vectors x = [3, 4, 5] and y = [5, 12, 13]. Results can be kept in formula form in terms of the component values of x and y (calculation of the final value is not required).

2. Given a classifier with True Positives/Negatives (TP, TN) and False Positives/Negatives (FP, FN), what is the highest Recall value r that a model can achieve? Define the Recall measure via (TP, TN, FP, FN). How can one design a simple model which achieves the maximum value for Recall?

3. Given the following transactions: {a, b, c}, {a, c}, {b, c}, {a}, {b}, {c}, with minsup = 60%, what itemsets would be frequent? What would the support s of the association rule {a} ⟹ {c} be? What would the confidence c of this rule be? Given the minsup value, would this be a valid rule that is extracted via the Apriori Algorithm?

4. Given a data matrix D with d = 5 features/columns and a total variance of 100, an analyst performs a PCA via eigenvalue decomposition, with the resulting eigenvalues [35, 25, 20, 15, 5]. If the analyst wishes to reduce dimensionality with 80% of variance explained, how many dimensions would the analyst be able to reduce their selection to? What would be the standard deviations σi of the data for each of these selected dimensions?

(Illustrative sketches of several of the above quantities follow below.)
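For Part I, a minimal Python sketch of several of the quantities above: the categorical binning of question 1, the Accuracy/Error Rate formulas of question 2, and the L1/L2 norms of question 6. The helper names (categorize, accuracy, error_rate) and the count convention noted in the comments (f10 as false negatives, f01 as false positives) are illustrative assumptions, not prescribed by the exam.

import math

# Question 1: map each component of x onto the assumed categories A/B/C.
x = [4.4, 5.1, -3.7, 2.1, -1.9]

def categorize(v):
    if v <= -2.5:        # category A
        return "A"
    if v >= 2.5:         # category C
        return "C"
    return "B"           # category B: -2.5 < v < 2.5

print([categorize(v) for v in x])  # ['C', 'C', 'A', 'B', 'B']

# Question 2: Accuracy and Error Rate from confusion-matrix counts, using
# the common convention f11 = TP, f00 = TN, f10 = FN, f01 = FP.
def accuracy(f11, f10, f01, f00):
    return (f11 + f00) / (f11 + f10 + f01 + f00)

def error_rate(f11, f10, f01, f00):
    return (f10 + f01) / (f11 + f10 + f01 + f00)

# Question 6: L1 (Manhattan) and L2 (Euclidean) norms and distances.
x1, x2 = [3, 4], [5, 12]

def l1(v):
    return sum(abs(c) for c in v)

def l2(v):
    return math.sqrt(sum(c * c for c in v))

print(l1(x1), l2(x1))      # 7, 5.0
print(l1(x2), l2(x2))      # 17, 13.0
diff = [a - b for a, b in zip(x1, x2)]
print(l1(diff), l2(diff))  # 10 vs ~8.25 -- the L1 distance is larger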
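Likewise for Part II, a sketch of the cosine similarity (question 1), Recall (question 2), support/confidence over the six listed transactions (question 3), and cumulative variance explained for the given eigenvalues (question 4); function and variable names are again illustrative.

import math

# Question 1: cosine similarity = (x . y) / (||x|| * ||y||).
x, y = [3, 4, 5], [5, 12, 13]
dot = sum(a * b for a, b in zip(x, y))     # 128
norm_x = math.sqrt(sum(a * a for a in x))  # sqrt(50)
norm_y = math.sqrt(sum(b * b for b in y))  # sqrt(338)
print(dot / (norm_x * norm_y))             # 128 / sqrt(16900) = 128/130

# Question 2: Recall = TP / (TP + FN); predicting every record as
# positive drives FN to 0 and so attains the maximum value of 1.
def recall(tp, fn):
    return tp / (tp + fn)

# Question 3: support and confidence over the six transactions.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a"}, {"b"}, {"c"}]

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

print(support({"a", "c"}))                   # support of {a} => {c}
print(support({"a", "c"}) / support({"a"}))  # confidence of {a} => {c}

# Question 4: cumulative fraction of variance explained, and the standard
# deviation sqrt(eigenvalue) along each principal component.
eigenvalues = [35, 25, 20, 15, 5]
total = sum(eigenvalues)                     # total variance = 100
cumulative = 0.0
for k, lam in enumerate(eigenvalues, start=1):
    cumulative += lam / total
    print(k, cumulative, math.sqrt(lam))     # 80% is reached at k = 3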
Part III – Essay Question (Show Argument/Proof) – 20 points each, 20 points total

1. Given a decision tree node containing 10 records, half of which belong to Class CA and the other half of which belong to Class CB, show the impurity I of the node under the Entropy, Gini, and Misclassification Error measures. What would the value of these measures be for the child nodes, assuming an optimal split is performed? (Hint: Assume 0 log2 0 = 0. See the sketch following the bonus questions.)

Lucky 7 – Bonus Questions (Industry News, AI/ML Topics) – 1 point each, 7 points total

1. What model recently released by DeepMind allows for accurate prediction of the 3-dimensional shape of a protein molecule given input amino acids?

2. Which firm recently fired its head of AI ethics, shortly after the controversial departure of one of its senior researchers?

3. What family of algorithms was recently developed that can solve classic treasure-hunting video games such as Pitfall on Atari?

4. What disease was IBM able to predict the onset of, based on changes in writing/language, via the use of machine learning models?

5. What category of modified videos did a consortium led by Facebook/Microsoft/Cornell/MIT recently introduce a detection challenge for?

6. Which firm recently released a new image recognition algorithm that was trained on over 1 billion images but did not require manual labels?

7. What quantum computing goal was recently achieved by Google, which was revealed to the public via NASA?
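Finally, for Part III, a minimal sketch of the three impurity measures, assuming the hinted convention 0 log2 0 = 0; p denotes the fraction of records in class CA, so the 50/50 parent node has p = 0.5. Function names are illustrative.

import math

def entropy(p):
    # Two-class entropy; terms with probability 0 are dropped (0 * log2(0) = 0).
    return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

def gini(p):
    return 1 - (p ** 2 + (1 - p) ** 2)

def misclassification_error(p):
    return 1 - max(p, 1 - p)

# Parent node: 10 records, 5 in class CA and 5 in class CB.
for f in (entropy, gini, misclassification_error):
    print(f.__name__, f(0.5))  # 1.0, 0.5, 0.5

# An optimal split yields pure children (p = 0 or p = 1), where all
# three measures evaluate to 0.
print(entropy(1.0), gini(1.0), misclassification_error(1.0))  # 0.0 0 0.0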