Untitled

Possible answers: Yes, No or Maybe (needs more refinement)

1) The description below is for an algorithm that is used in machine learning. Is it a classification algorithm?
1. A linear function is used to compute a predicted output, from a given input
2. The linear function predicts the probability that the given input is an instance of a successful event.
3. Our model is trained on training data to find the the best coefficients for the linear function.
4. Given new data, the model will tell us the probability that the new instances will successful or unsuccessful based on a probability baseline of 50%

Answer: Yes

Reason: We are teaching a learning agent to predict if new instances belong to different classes. In this case, it is a binary classification because there are only 2 classes: successful or unsuccessful. In fact, this is a very high level description of Logistic Regression (a statistical regression technique that is used in machine learning for classification tasks, because it predicts classes, not specific values).

Question 2) Given the loudness of a song, I want my model to predict the estimated number of beats per minute that it will have. Is this classification?

Answer: No

Reason: This is not a classification task because whilst BPM takes discrete values, it is theoretically a countably infinite measure - they do not form a finite set of discrete values which have sufficient numbers of instance for each number (or proposed “category”) of BPM. It is more easily measured by a model which predicts a value (regression) than a model that predicts a category.


Question 3) Given the historical performance of three teams, I want my model to predict their ranking in a competition this coming year. Is this classification?

Answer: Maybe (needs more refinement)

Reason: This depends on the approach taken to the problem and its context. If the competition uses a consistent scoring metric, we can use the historical data to predict the score and then deduce the team ranking. However, that would not be a classification task. If we are trying to predict whether each team’s likelihood of winning the competition and then deducing the ranking from their probability of winning, then it would be a binary classification task with probabilities of confidence.


Question 4) Which of these encoding methods is not useful for generating ONE new column of numerical categories from worded categories?


- Find and Replace
- One Hot Encoding - CORRECT ANSWER
- Scikit-learn’s Label Encoder