Confuse Skull: Machine Learning Using scikit-learn MCQs

1. Which of the following Pandas utilities can be used to read from an Oracle database?

read_sql
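
A minimal sketch of read_sql in use. It assumes an in-memory SQLite database from the standard library as a stand-in, because an Oracle connection needs a driver and credentials that this quiz does not supply; the table name and rows are made up for illustration.

```python
import sqlite3

import pandas as pd

# Stand-in database (assumption): SQLite in memory. With Oracle you would
# pass an Oracle connection or SQLAlchemy engine to read_sql instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (code TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("HYD", 10), ("CHN", 11), ("MUM", 20)],  # hypothetical rows
)
conn.commit()

# read_sql runs the query and returns the result set as a DataFrame.
df = pd.read_sql("SELECT code, population FROM cities ORDER BY code", conn)
print(df)
conn.close()
```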

2. What do the functions of the sklearn.datasets module whose names start with fetch do?

They download a specific dataset from an external library (an online repository) and cache it locally.

3. Which of the following functions is used to load the famous iris dataset from sklearn.datasets?

load_iris()

4. Which of the following Python libraries is used for Machine Learning?

Scikit-Learn

5. Which of the following sklearn modules contains popular, already processed datasets?

datasets

6. What is the type of the iris variable in the code below?

from sklearn import datasets
iris = datasets.load_iris()

sklearn.datasets.base.Bunch (exposed as sklearn.utils.Bunch in recent scikit-learn releases)

7. What value does the attribute DESCR of a specific loaded dataset contain?

Contains a description of the loaded dataset
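
The loaded iris object from the questions above can be inspected directly. This is a small sketch; the slice length used when printing DESCR is an arbitrary choice.

```python
from sklearn import datasets

iris = datasets.load_iris()

print(type(iris).__name__)   # class name of the returned container
print(iris.data.shape)       # feature matrix: (n_samples, n_features)
print(iris.feature_names)    # names of the feature columns
print(iris.target_names)     # names of the classes
print(iris.DESCR[:200])      # beginning of the dataset description text
```

The features live in iris.data and the labels in iris.target; DESCR is a plain string.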

8. scikit-learn provides utilities for building artificial datasets.

True
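
Two of the generators in sklearn.datasets illustrate this. The sample counts, feature counts, and seeds below are arbitrary values chosen for the sketch.

```python
from sklearn.datasets import make_blobs, make_classification

# Three Gaussian blobs in two dimensions, useful for clustering demos.
X_blobs, y_blobs = make_blobs(
    n_samples=100, centers=3, n_features=2, random_state=0
)

# A synthetic two-class classification problem with five features.
X_clf, y_clf = make_classification(
    n_samples=100, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

print(X_blobs.shape, sorted(set(y_blobs)))
print(X_clf.shape, sorted(set(y_clf)))
```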

9. Which of the following libraries is widely used to read structured data from external sources?

Pandas

10. Which of the following expressions accesses the features of the iris dataset loaded in the code below?

from sklearn import datasets
iris = datasets.load_iris()

iris.data

11. Scikit-learn provides a Pipeline utility to build a pipeline that performs a series of transformations.

True
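
A minimal Pipeline sketch, assuming a scaling step followed by a nearest-neighbors classifier on the iris data; the step names "scale" and "knn" are illustrative labels, not required names.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each step is a (name, estimator) pair; every step but the last must be
# a transformer, and the last one is the final estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
pipe.fit(X, y)

train_accuracy = pipe.score(X, y)  # accuracy on the training data itself
print(train_accuracy)
```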

12. The preprocessing technique in which categorical values are transformed to categorical integers is known as ________________.

Encoding

13. The preprocessing technique in which missing values are replaced with the mean of a dataset is known as _______________.

Imputing
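
Mean imputation can be sketched with SimpleImputer from sklearn.impute. The small array is made up so that the column means are easy to check by hand.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Two columns, each with one missing value (np.nan).
X = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [3.0, np.nan],
])

# strategy="mean" replaces each NaN with the mean of its column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)
```

Here the first column mean is 2.0 and the second is 3.0, so those values fill the gaps.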

14. The preprocessing technique in which a dataset is transformed to a distribution of mean 0 and variance 1 is known as __________________.

Mean removal
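
In scikit-learn this step (also called standardization) is commonly done with StandardScaler. A small sketch on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
])

# Subtract the column mean and divide by the column standard deviation.
X_std = StandardScaler().fit_transform(X)

print(X_std.mean(axis=0))  # approximately 0 for every column
print(X_std.std(axis=0))   # approximately 1 for every column
```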

15. Which of the following APIs is used to normalize a sample to unit norm?

normalize (the function sklearn.preprocessing.normalize; Normalizer is the equivalent transformer class)

16. Which of the following APIs is used to scale a dataset to the range 0 to 1?

MinMaxScaler
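
The difference between unit-norm normalization and min-max scaling is easiest to see side by side. In this sketch (made-up data) normalize works on each row, while MinMaxScaler works on each column.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, normalize

X = np.array([
    [3.0, 4.0],
    [1.0, 0.0],
    [5.0, 12.0],
])

# Row-wise: every sample is rescaled to have L2 norm 1.
X_unit = normalize(X, norm="l2")

# Column-wise: every feature is rescaled to the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

print(X_unit)
print(X_minmax)
```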

17. What is the output of the following code?

import sklearn.preprocessing as preprocessing
regions = ['HYD', 'CHN', 'MUM', 'HYD', 'KOL', 'CHN']
print(preprocessing.LabelEncoder().fit(regions).transform(regions))

[1 0 3 1 2 0]
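
The codes follow from the sorted order of the distinct labels. A short sketch that exposes the mapping through classes_ and reverses it with inverse_transform:

```python
from sklearn.preprocessing import LabelEncoder

regions = ['HYD', 'CHN', 'MUM', 'HYD', 'KOL', 'CHN']

encoder = LabelEncoder().fit(regions)
codes = encoder.transform(regions)

print(encoder.classes_)                   # distinct labels in sorted order
print(codes)                              # position of each label in classes_
print(encoder.inverse_transform(codes))   # back to the original strings
```

CHN, HYD, KOL, MUM sort to positions 0, 1, 2, 3, which gives the answer above.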

18. Which of the following sklearn modules contains preprocessing utilities?

preprocessing

19. The ________ parameter is used to control the number of neighbors of KNeighborsClassifier.

n_neighbors

20. Which regressor utility of sklearn.neighbors is used to learn from k nearest neighbors of each query point?

KNeighborsRegressor
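
A tiny sketch of neighbors-based regression on made-up one-dimensional data, where the prediction is the average target of the k nearest training points:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0])

# With n_neighbors=2 and uniform weights, the prediction is the mean
# target of the two closest training points.
reg = KNeighborsRegressor(n_neighbors=2)
reg.fit(X_train, y_train)

prediction = reg.predict([[1.5]])
print(prediction)
```

The two nearest neighbors of 1.5 are 1.0 and 2.0, so the prediction is their average.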

21. Which of the following parameters can be used to give more weight to points that are nearer to the query point in the nearest neighbors method?

weights

22. Which of the following classes is used to implement K-Nearest Neighbors classification in scikit-learn?

KNeighborsClassifier
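
The n_neighbors and weights parameters from the questions above can be tried together. This is a sketch on the iris data; the split ratio and random seed are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# weights="distance" lets closer neighbors count more than distant ones;
# the default, weights="uniform", gives every neighbor the same vote.
knn = KNeighborsClassifier(n_neighbors=5, weights="distance")
knn.fit(X_train, y_train)

test_accuracy = knn.score(X_test, y_test)
print(test_accuracy)
```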

23. Which of the following algorithms can be used with any nearest neighbors utility in scikit-learn?

all

24. Which of the following is an essential parameter of RadiusNeighborsClassifier?

radius
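
A small sketch of RadiusNeighborsClassifier on made-up one-dimensional data. Instead of a fixed number of neighbors, every training point within the given radius of the query gets a vote.

```python
import numpy as np
from sklearn.neighbors import RadiusNeighborsClassifier

# Two well-separated groups on a line.
X_train = np.array([[0.0], [0.5], [1.0], [5.0], [5.5], [6.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = RadiusNeighborsClassifier(radius=1.0)
clf.fit(X_train, y_train)

# Each query point only sees training points at most 1.0 away from it.
predictions = clf.predict([[0.4], [5.6]])
print(predictions)
```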

25. Neighbors-based regression is mainly used when the data labels are continuous rather than discrete variables.

True

26. What is the strategy followed by the Radius Neighbors method?

It looks in the vicinity of each training point, within the area covered by a fixed radius.

27. Which of the following sklearn modules is used to deal with Nearest Neighbors?

neighbors

28. A feature can be reused to split a tree during Decision tree creation.

True

29. Which of the following parameters is used to tune a Decision Tree?

max_depth
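
The effect of max_depth can be sketched by comparing a depth-limited tree with an unrestricted one on the iris data. The scores here are training accuracy, so they show how closely each tree fits the data it has seen, not how well it generalizes.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A shallow tree: at most two levels of splits.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# No depth limit: the tree keeps splitting until the leaves are pure.
deep = DecisionTreeClassifier(random_state=0).fit(X, y)

print(shallow.get_depth(), shallow.score(X, y))
print(deep.get_depth(), deep.score(X, y))
```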

30. Which of the following sklearn modules is used for dealing with Decision Trees?

tree

31. A small change in data features may change a Decision Tree completely.

True

32. Which of the following utilities is used for regression using decision trees?

DecisionTreeRegressor
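
A minimal sketch of DecisionTreeRegressor on made-up data with a single step in the target. A depth-one tree learns one split and predicts the mean target on each side.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 0.0, 10.0, 10.0])

# max_depth=1 allows exactly one split, which falls between 1.0 and 2.0.
tree_reg = DecisionTreeRegressor(max_depth=1, random_state=0)
tree_reg.fit(X_train, y_train)

predictions = tree_reg.predict([[0.5], [2.5]])
print(predictions)
```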

33. Decision trees overfit the data very easily.

True

34. Ensemble methods generally perform better than a single Decision Tree.

True

35. More improvement is found in an ensemble when the base estimators are highly correlated.

False

36. Which parameter sets the number of base estimators (trees) in RandomForestClassifier?

n_estimators
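
A short sketch showing that n_estimators determines how many trees the forest holds. The value 25 and the seed are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X, y)

print(len(forest.estimators_))      # one fitted tree per estimator
print(forest.feature_importances_)  # importances averaged over the trees
```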

37. Which of the following sklearn modules is used for dealing with ensemble methods?

ensemble

38. Which of the following utilities of sklearn.ensemble is used for implementing classification with the bagging method?

BaggingClassifier
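
A minimal bagging sketch on the iris data. It relies on the default base estimator, which is a decision tree, and evaluates with 5-fold cross-validation; the estimator count and seed are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the 10 base estimators is trained on a bootstrap sample of the
# training data; predictions are combined by voting.
bagging = BaggingClassifier(n_estimators=10, random_state=0)

scores = cross_val_score(bagging, X, y, cv=5)  # accuracy per fold
print(scores.mean())

bagging.fit(X, y)
print(len(bagging.estimators_))
```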

39. Which of the following utilities are provided by sklearn to perform classification using support vector machines?

All the options

40. What values can be used for the kernel parameter of the SVC class?

All the options

41. Scaling or Normalization of data improves the accuracy of support vector machines.

True

42. The LinearSVC class accepts a kernel parameter.

False
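
One way to check this is to compare the constructor parameters of the two classes through get_params, as in this small sketch:

```python
from sklearn.svm import SVC, LinearSVC

svc_params = SVC().get_params()
linear_params = LinearSVC().get_params()

# SVC exposes a kernel parameter; LinearSVC is always linear and has none.
print("kernel" in svc_params)
print("kernel" in linear_params)
```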

43. Which attribute provides details of obtained support vectors, after classifying data using SVC?

support_vectors_
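
A short sketch of reading the support vectors after fitting SVC on the iris data; the kernel and C values are just the defaults written out explicitly.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

svc = SVC(kernel="rbf", C=1.0)
svc.fit(X, y)

# support_vectors_ holds the training samples that define the boundary;
# n_support_ gives the number of support vectors per class.
print(svc.support_vectors_.shape)
print(svc.n_support_)
```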

44. Which approach is used by SVC and NuSVC for multi-class classification?

one vs one

45. What happens when a very small value is used for the parameter C in support vector machines?

None

46. Which of the following parameters of the SVC class is used for fine-tuning the model?

C

47. Which of the following sklearn modules provides the utilities to deal with support vector machines?

svm

48. Which of the following utilities of sklearn.cluster is used for performing k-means clustering?

KMeans()
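
A minimal KMeans sketch on six made-up points forming two obvious groups. The cluster numbering itself is arbitrary, so only group membership is meaningful.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],        # group near the origin
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],  # group far away
])

# n_clusters is the user-chosen k; n_init restarts guard against a bad start.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)
print(kmeans.cluster_centers_)
```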

49. Agglomerative Clustering follows a top-down approach.

False

50. Which of the following parameters are used to control Density-based clustering?

eps, min_samples
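
The two parameters can be seen at work with DBSCAN on made-up points: two tight groups plus one isolated point. The eps and min_samples values are chosen to suit this toy data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],        # dense group
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],  # second dense group
    [50.0, 50.0],                              # isolated point
])

# eps: neighborhood radius; min_samples: points (including the point
# itself) needed inside that radius for a point to be a core point.
dbscan = DBSCAN(eps=1.5, min_samples=2)
labels = dbscan.fit_predict(X)

print(labels)  # noise points are labeled -1
```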

51. What does the Homogeneity score of a clustering algorithm indicate?

Verifies if each cluster contains only members of a single class.
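
A small sketch with homogeneity_score from sklearn.metrics on made-up labels shows the two extremes: clusters that each hold a single class, and one cluster that mixes everything.

```python
from sklearn.metrics import homogeneity_score

labels_true = [0, 0, 1, 1]

# Every cluster contains members of exactly one class.
pure = homogeneity_score(labels_true, [0, 0, 1, 1])

# Over-splitting still keeps each cluster pure, so the score stays high.
split = homogeneity_score(labels_true, [0, 1, 2, 3])

# One big cluster mixes both classes.
mixed = homogeneity_score(labels_true, [0, 0, 0, 0])

print(pure, split, mixed)
```

Homogeneity alone does not penalize over-splitting, which is why it is usually read together with the completeness score.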

52. Which of the following clustering techniques is used to group data points into a user-specified number k of clusters?

K-means clustering

53. Spectral Clustering is best suited for identifying dense clusters.
