Structured Data Classification MCQs

1.  Identify the structured data from the following.

   View Answer   

   Data from a MySQL database and Excel



2. What kind of classification is our case study 'Churn Analysis'?

   View Answer   

   Binary 



3. Which command is used to identify the unique values of a column?

   View Answer   

   unique()
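
As a rough illustration, assuming the question refers to pandas, `unique()` can be called on a single column (a Series). The small DataFrame and its column name below are made up for the example and are not taken from the quiz dataset.

```python
import pandas as pd

# Hypothetical toy data, used only to show the call.
df = pd.DataFrame({"species": ["setosa", "virginica", "setosa", "versicolor"]})

# Series.unique() returns the distinct values in order of first appearance.
unique_species = df["species"].unique()
print(unique_species)
```

A related method, `nunique()`, returns only the number of distinct values.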



4. Which preprocessing technique is used to make the data Gaussian with zero mean and unit variance?

   View Answer   

   Standardisation
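
A minimal sketch of standardisation, assuming scikit-learn's `StandardScaler` is the tool in question; the feature matrix is invented for the example. Note that rescaling changes the mean and variance of each column but not the shape of its distribution.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix with two columns on very different scales.
X = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])

# fit_transform subtracts each column's mean and divides by its standard deviation.
X_scaled = StandardScaler().fit_transform(X)

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```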



5. Cross-validation is a technique used to evaluate a classifier by dividing the data set into a training set to train the classifier and a testing set to test it.

   View Answer   

   True 



6. True Negative is when the predicted instance and the actual instance are both positive.

   View Answer   

   False 



7. True Positive is when the predicted instance and the actual instance are both not negative.

   View Answer   

   True



8. What are the advantages of Naive Bayes?

   View Answer   

   Requires less training data



9. High classification accuracy always indicates a good classifier.

   View Answer   

   False



10. Categorical variables have

   View Answer   

   no logical order



11. Cross-validation technique will provide accurate results when the training set and the testing set are from two different populations.

   View Answer   

   False



12. Choose the correct sequence for classifier building from the following:

   View Answer   

   Initialize -> Train -> Predict -> Evaluate
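
The four steps can be sketched with scikit-learn as follows. The choice of a decision tree, the bundled iris data from `load_iris`, and the split parameters are assumptions made for illustration; any classifier with `fit` and `predict` follows the same pattern.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small labelled dataset and hold out 30% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)  # Initialize
clf.fit(X_train, y_train)                                   # Train
y_pred = clf.predict(X_test)                                # Predict
accuracy = accuracy_score(y_test, y_pred)                   # Evaluate
print(accuracy)
```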



13. Which of the given hyperparameter(s), when increased, may cause a random forest to overfit the data?

   View Answer   

   Depth of Tree



14. To view the first 3 rows of the dataset, which of the following commands is used? Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   iris.head(3)



15. Pruning is a technique associated with 

   View Answer   

   Decision tree



16. The commonly used package for machine learning in python is 

   View Answer   

   sklearn



17. A classifier that can compute using numeric as well as categorical values is

   View Answer   

   Decision Tree Classifier



18. Can we consider sentiment classification as a text classification problem?

   View Answer   

   yes



19. Assume you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data. Which of the following is true when your model has 99% accuracy on the test data?

   View Answer   

   For imbalanced-class problems, accuracy is not a good metric.
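
To see why, consider a sketch with made-up labels: a model that always predicts the majority class reaches 99% accuracy while never finding a single minority-class instance. The label lists below are hypothetical.

```python
# Hypothetical test set: 99 majority-class (0) labels and 1 minority-class (1) label.
y_true = [0] * 99 + [1]

# A "classifier" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the actual label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the minority class: fraction of actual positives that were found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
minority_recall = true_positives / y_true.count(1)

print(accuracy)         # 0.99
print(minority_recall)  # 0.0
```

Metrics such as recall, precision, or the F1 score on the minority class expose this failure where accuracy does not.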



20. Email spam detection is an example of

   View Answer   

    supervised classification



21. A technique used to depict the performance in a tabular form that has 2 dimensions namely “actual” and “predicted” sets of data. 

   View Answer   

   Confusion Matrix
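
A small sketch of a confusion matrix for a binary problem, assuming scikit-learn's `confusion_matrix`; the label lists are invented. It also shows where the true negative and true positive counts from the earlier questions sit in the table.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 1 = positive, 0 = negative.
y_actual    = [1, 1, 1, 0, 0, 0, 0, 1]
y_predicted = [1, 0, 1, 0, 0, 1, 0, 1]

# Rows are actual classes and columns are predicted classes, ordered by label (0, then 1).
cm = confusion_matrix(y_actual, y_predicted)

# For a 2x2 matrix, ravel() yields the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(cm)
print(tn, fp, fn, tp)  # 3 1 1 3
```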



22. What kind of classification is the given case study (IRIS dataset)? Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   Multi-class classification



23. Ordinal variables have

   View Answer   

    clear logical order



24. Which command is used to select all NUMERIC types in the dataset? Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   iris_num = iris_data.select_dtypes(include=[numpy.number])
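
The call can be tried on a small stand-in frame. The three rows below are hypothetical and only mimic the layout of the iris data; `numpy` is imported under its full name to match the answer.

```python
import numpy
import pandas as pd

# Hypothetical stand-in for the iris data: two numeric columns and one text column.
iris_data = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "species": ["setosa", "versicolor", "virginica"],
})

# Keep only the columns whose dtype is a NumPy numeric type.
iris_num = iris_data.select_dtypes(include=[numpy.number])
print(list(iris_num.columns))
```

Passing `exclude=[numpy.number]` instead would return the non-numeric columns.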



25. The number of categorical attributes in the original dataset. Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   1



26. Which classifier converges easily with less training data?

   View Answer   

   Naive Bayes Classifier



27. Ensemble learning is used when you build component classifiers that are more accurate and independent from each other.

   View Answer   

   true 



28. Clustering is an example of

   View Answer   

   unsupervised classification



29. Model Tuning helps to increase the accuracy 

   View Answer   

   True



30. Imputing is a strategy to handle 

   View Answer   

   Missing Values
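
One common form of imputing replaces each missing value with the mean of its column. The sketch below assumes scikit-learn's `SimpleImputer`; the array is made up for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical feature matrix with one missing value in each column.
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])

# strategy="mean" fills each NaN with the mean of the observed values in that column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)
```

Other strategies accepted by `SimpleImputer` include `"median"` and `"most_frequent"`.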



31. Classification where each data point is mapped to more than one class is called

   View Answer   

   Multi-label classification



32. The fit(X, y) is used to 

   View Answer   

   Train the Classifier



33. Supervised learning differs from unsupervised learning as supervised learning requires __________

   View Answer   

   Labeled data



34. Clustering is a supervised classification.

   View Answer   

   False



35. Select the correct option which directly achieve multi-class classification (without support of binary classifiers).

   View Answer   

   K Nearest Neighbor



36. The classification where each data is mapped to more than one class is called ___________

   View Answer   

   Multi Label Classification



37. Email spam data is an example of __________

   View Answer   

   Unstructured data



38. The most widely used package for machine learning in Python is _________

   View Answer   

   sklearn



39. Pruning is a technique associated with __________

   View Answer   

   Decision tree



40. What does the command sentiment_analysis_data['label'].value_counts() return?

   View Answer   

   counts of unique values in the 'label' column
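
A quick sketch of the call on invented data. The DataFrame reuses the name from the question, but its contents are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the sentiment data set.
sentiment_analysis_data = pd.DataFrame({"label": ["pos", "neg", "pos", "pos", "neg"]})

# value_counts() returns a Series indexed by label, sorted by descending count.
label_counts = sentiment_analysis_data["label"].value_counts()
print(label_counts)
```

The same output is a convenient first check for class imbalance.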



41. Select the pre-processing technique(s) from the following.

   View Answer   

   All of the listed options



42. Which of the given hyperparameters, when increased, may cause a random forest to overfit the data?

   View Answer   

   Depth of tree



43. Select the correct statement about Nonlinear classification.

   View Answer   

   Kernel tricks are used by Nonlinear classifiers to achieve maximum-margin hyperplanes.



44. Choose the correct sequence for classifier building from the following.

   View Answer   

   Initialize -> Train -> Predict -> Evaluate



45. What command should be given to tokenize a sentence into words?

   View Answer   

   from nltk.tokenize import word_tokenize; word_tokens = word_tokenize(sentence)



46. Choose the correct sequence from the following.

   View Answer   

   Data Analysis -> PreProcessing -> Model Building -> Predict



47. The following are all classification techniques, except ___________

   View Answer   

   StratifiedShuffleSplit



48. The commonly used package for machine learning in python is 

   View Answer   

   sklearn



49. How many new columns does the following command return? Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   iris_series = pd.get_dummies(iris['Species'])

   View Answer   

   3
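
`pd.get_dummies` creates one indicator column per distinct value, so a column with three species yields three new columns. The sketch below uses a hypothetical four-row frame rather than the downloaded file.

```python
import pandas as pd

# Hypothetical stand-in with the three iris species.
iris = pd.DataFrame({"Species": ["setosa", "versicolor", "virginica", "setosa"]})

# One-hot encode the column: one new indicator column per distinct species.
iris_series = pd.get_dummies(iris["Species"])
print(list(iris_series.columns))
```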



51. Naive Bayes Algorithm is useful for :

   View Answer   

   In-depth analysis



52. A process used to identify data points that are simply unusual 

   View Answer   

   Anomaly Detection



53. Is there a class imbalance problem in the given data set?

   View Answer   

   no 



54. Which of the following is not a technique to process missing values?

   View Answer   

   One hot encoding



55. Images and documents are examples of

   View Answer   

   Unstructured Data



56. Email spam detection is an example of

   View Answer   

   Supervised classification



57. Choose the correct sequence for classifier building from the following:

   View Answer   

   Initialize -> Train -> Predict -> Evaluate



58. Identify the command used to view the dataset SIZE and the value it returns. Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   iris.shape, (150, 5)



59. Which type of cross validation is used for imbalanced dataset?

   View Answer   

   Stratified K-fold
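
Stratified k-fold keeps the class proportions roughly equal in every fold, which plain k-fold does not guarantee. A minimal sketch, assuming scikit-learn's `StratifiedKFold` and made-up labels:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced labels: 90 of class 0 and 10 of class 1.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # placeholder features; only the labels drive the split

# Each of the 5 test folds receives the same share of the minority class.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
minority_per_fold = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]
print(minority_per_fold)  # [2, 2, 2, 2, 2]
```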



60. To view the first 3 rows of the dataset, which of the following commands is used? Download the dataset from: https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.

   View Answer   

   iris.head(3)



61. Imagine you have just finished training a decision tree for spam classification and it is showing abnormally bad performance on both your training and test sets. Assume that your implementation has no bugs. What could be the reason for this problem?

   View Answer   

   Your decision tree is too shallow (underfitting)


