Unstructured Data Classification MCQs

1.  Identify the unstructured data from the following.

   View Answer   

   IMAGE



2.  What kind of classification is our case study 'Spam Detection'?

   View Answer   

   BINARY



3.  Which preprocessing technique is used to remove the most commonly used words?

   View Answer   

   STOPWORDS



4.  The cross-validation technique evaluates a classifier by dividing the data set into a training set to train the classifier and a testing set to test it.

   View Answer   

   TRUE
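
The statement above describes a single train/test split; k-fold cross-validation repeats that split k times so that every sample lands in the test set exactly once. The following is a minimal pure-Python sketch of the index bookkeeping only. `k_fold_indices` is a hypothetical helper written for illustration, not a library function, and it uses contiguous, unshuffled folds as a simplifying assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice the classifier would be trained on each `train_idx` and scored on the matching `test_idx`, and the k scores averaged.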



5.  True Negative is when the predicted instance and the actual instance are both positive.

   View Answer   

   FALSE



6. True Positive is when the predicted instance and the actual instance are both not negative (that is, both positive).

   View Answer   

   TRUE
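
The four outcomes behind questions 5 and 6 can be counted directly from paired labels. This is a small sketch, assuming binary labels where `1` is the positive class; `confusion_counts` and the sample label lists are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive and a != positive:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return tp, tn, fp, fn


# Illustrative labels only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)
```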



7. TF and IDF use matrix representations.

   View Answer   

   TRUE



8. Which of the following commands is used to view the dataset size, and what value is returned?

   View Answer   

   sentiment_analysis_data.shape, which returns (7086, 2). Note that shape is an attribute, not a method, so it takes no parentheses.



9. What command should be given to tokenize a sentence into words?

   View Answer   

   from nltk.tokenize import word_tokenize; Word_tokens = word_tokenize(sentence)



10. TF-IDF is a feature extraction technique.

   View Answer   

   True
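
To make the feature-extraction idea concrete, here is a pure-Python sketch of one common TF-IDF variant, assuming tf = count / document length and idf = log(N / document frequency). Libraries often use smoothed or normalized variants, so their numbers may differ. The three toy documents and the helper names are illustrative only.

```python
import math

# Toy corpus, invented for illustration.
docs = [
    "free prize call now",
    "call me now",
    "meeting at noon",
]
tokenized = [d.split() for d in docs]


def tf(term, tokens):
    """Term frequency: share of the document's tokens equal to `term`."""
    return tokens.count(term) / len(tokens)


def idf(term, corpus):
    """Inverse document frequency; assumes `term` occurs in at least one document."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / df)


def tf_idf(term, tokens, corpus):
    return tf(term, tokens) * idf(term, corpus)


score_prize = tf_idf("prize", tokenized[0], tokenized)  # appears in 1 of 3 documents
score_call = tf_idf("call", tokenized[0], tokenized)    # appears in 2 of 3 documents
print(score_prize, score_call)
```

The rarer term ("prize") gets the higher weight even though both terms occur once in the first document.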



11. What is the purpose of lemmatization?

   View Answer   

   To convert words into a proper base form



12. Which of the following is not a preprocessing method used for unstructured data classification?

   View Answer   

   confusion_matrix



13. Stemming and lemmatization give the same result.

   View Answer   

   False
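
A toy contrast shows why the two differ: stemming chops suffixes by rule and may produce non-words, while lemmatization looks words up in a lexical resource and returns a dictionary form. Both functions below are deliberately crude stand-ins written for this sketch. They are not the Porter stemmer or a WordNet lemmatizer, and `TOY_LEMMAS` is a hypothetical three-entry lexicon.

```python
def toy_stem(word):
    """Crude suffix stripping; the result need not be a real word."""
    for suffix in ("ies", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


# Hypothetical mini-lexicon standing in for a lexical knowledge base.
TOY_LEMMAS = {"studies": "study", "feet": "foot", "running": "run"}


def toy_lemmatize(word):
    """Dictionary lookup; unknown words are returned unchanged."""
    return TOY_LEMMAS.get(word, word)


for w in ("studies", "feet", "running"):
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
```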



14. In a Document Term Matrix (DTM) each row represents _______

   View Answer   

   A document (each cell in the row holds the TF value of one term for that document)
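
A document-term matrix can be sketched with plain lists: one row per document, one column per vocabulary term, and raw term counts in the cells. The corpus below is invented for illustration, and whitespace splitting is a simplifying assumption in place of a real tokenizer.

```python
from collections import Counter

# Toy corpus, invented for illustration.
docs = ["good movie", "bad movie", "good good plot"]
tokenized = [d.split() for d in docs]

# Columns: the sorted vocabulary across all documents.
vocabulary = sorted({term for tokens in tokenized for term in tokens})

# Rows: one list of term counts per document.
dtm = []
for tokens in tokenized:
    counts = Counter(tokens)
    dtm.append([counts[term] for term in vocabulary])

print(vocabulary)
for row in dtm:
    print(row)
```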



15. The fit(X, y) method is used to __________

   View Answer   

   Train the classifier



16. Can we consider sentiment classification as a text classification problem?

   View Answer   

   YES



17. High classification accuracy always indicates a good classifier.

   View Answer   

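   FALSE (on imbalanced data, a classifier can reach high accuracy while missing most instances of the minority class)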
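
Accuracy on its own can be misleading when the classes are imbalanced. The sketch below uses made-up labels (95 negatives, 5 positives) and a "classifier" that always predicts the negative class, then compares accuracy with recall on the positive class.

```python
# Illustrative data: 95 negatives (0) and 5 positives (1).
actual = [0] * 95 + [1] * 5
# A degenerate model that always predicts the majority class.
predicted = [0] * 100

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
true_pos = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_pos / sum(a == 1 for a in actual)

print("accuracy:", accuracy)
print("recall:", recall)
```

Here the accuracy is high although no positive instance is ever detected, which is why measures such as precision and recall are usually reported alongside it.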



18. A classifier that can compute using numeric as well as categorical values is __________

   View Answer   

   Naive Bayes (NB)



19. The following are performance evaluation measures, except __________

   View Answer   

   DecisionTree



20. Which NLP technique uses lexical knowledge base to obtain the correct base form of the words?

   View Answer   

   lemmatization



21. An algorithm that counts how many times a word appears in a document is __________

   View Answer   

   TF (term frequency); TF-IDF additionally weights this count by inverse document frequency



22. What is the output of the sentence “Good words bring good feelings to the heart” after performing tokenization, lemmatization and stop word removal?

   View Answer   

   'Good word bring good feeling heart'
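
The answer above can be reproduced with a toy pipeline. This is only a sketch: the stop word set and the lemma lookup are tiny hypothetical stand-ins that cover just this sentence, where a real pipeline would use a full stop word list and a proper lemmatizer.

```python
import re

STOP_WORDS = {"to", "the"}                              # hypothetical subset
TOY_LEMMAS = {"words": "word", "feelings": "feeling"}   # hypothetical lookup


def preprocess(sentence):
    # 1. Tokenize: keep runs of letters.
    tokens = re.findall(r"[A-Za-z]+", sentence)
    # 2. Lemmatize: look up the lowercase form, else keep the token as is.
    lemmas = [TOY_LEMMAS.get(t.lower(), t) for t in tokens]
    # 3. Remove stop words (case-insensitive).
    kept = [t for t in lemmas if t.lower() not in STOP_WORDS]
    return " ".join(kept)


result = preprocess("Good words bring good feelings to the heart")
print(result)
```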



23. Supervised learning differs from unsupervised learning as supervised learning requires __________

   View Answer   

   Labeled data



24. Clustering is a supervised classification.

   View Answer   

   False



25. Select the correct option that directly achieves multi-class classification (without the support of binary classifiers).

   View Answer   

   K Nearest Neighbor
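
K Nearest Neighbor handles any number of classes because it simply takes a majority vote among the labels of the closest training points. A minimal sketch with three classes follows; the points, labels, and the `knn_predict` helper are invented for illustration, and Euclidean distance is an assumption.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Three well-separated toy clusters, one per class.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
train_y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

for query in [(0.5, 0.5), (5.5, 5.5), (10.5, 0.5)]:
    print(query, "->", knn_predict(train_X, train_y, query))
```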



26. The classification where each data is mapped to more than one class is called ___________.

   View Answer   

   Multi Label Classification



27. Email spam data is an example of __________.

   View Answer   

   Unstructured data



28. The most widely used package for machine learning in Python is _________.

   View Answer   

   sklearn



29. Pruning is a technique associated with __________.

   View Answer   

   Decision Tree (DT)



30. What does the command sentiment_analysis_data['label'].value_counts() return?

   View Answer   

   counts of unique values in the 'label' column



31. Select the pre-processing technique(s) from the following.

   View Answer   

   all



32. Which of the given hyperparameters, when increased, may cause a random forest to overfit the data?

   View Answer   

   depth of tree



33. Select the correct statement about Nonlinear classification.

   View Answer   

   Kernel tricks are used by Nonlinear classifiers to achieve maximum-margin hyperplanes.



34.  Choose the correct sequence for classifier building from the following.

   View Answer   

   Initialize -> Train -> Predict -> Evaluate.
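
The four steps can be sketched end to end with a deliberately trivial model. `MajorityClassifier` is a hypothetical baseline written for this example, with `fit` and `predict` methods named after the convention that scikit-learn estimators follow. The data is made up.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical baseline: always predicts the most frequent training label."""

    def __init__(self):
        self.majority_ = None

    def fit(self, X, y):
        # "Training" here just records the most common label.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


# Illustrative data only.
X_train = [[1], [2], [3], [4]]
y_train = ["ham", "ham", "spam", "ham"]
X_test = [[5], [6]]
y_test = ["ham", "spam"]

clf = MajorityClassifier()        # 1. Initialize
clf.fit(X_train, y_train)         # 2. Train
y_pred = clf.predict(X_test)      # 3. Predict
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)  # 4. Evaluate
print(y_pred, accuracy)
```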



35.  What command should be given to tokenize a sentence into words?

   View Answer   

   from nltk.tokenize import word_tokenize; Word_tokens = word_tokenize(sentence)



36.  Choose the correct sequence from the following.

   View Answer   

   Data Analysis -> PreProcessing -> Model Building -> Predict.



37. The following are all classification techniques, except ___________

   View Answer   

   StratifiedShuffleSplit.


