Questions 1

Uploaded by

kiruthikaakannan
Copyright
© All Rights Reserved

Questions:

1. Skewness of a dataset measures:


a) Central Tendency
b) Spread of Data
c) Asymmetry of Distribution
d) Correlation Strength
Answer: c
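To make the idea of asymmetry concrete, here is a minimal sketch that computes the moment-based (population) skewness using only the standard library. The helper name `skewness` and the sample lists are illustrative assumptions, not part of the original question.

```python
from statistics import fmean


def skewness(values):
    """Moment-based (population) skewness: m3 / m2**1.5."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)  # second central moment
    m3 = fmean((v - mean) ** 3 for v in values)  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(skewness(symmetric))         # 0.0 for a symmetric sample
print(skewness(right_tailed) > 0)  # True: the long tail is on the right
```

A value near zero suggests a roughly symmetric sample; a positive value indicates a longer right tail and a negative value a longer left tail.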
2. Which Python library is best suited for kernel density estimation plots?
a) NumPy
b) Pandas
c) Seaborn
d) Matplotlib
Answer: c
3. ROC curve is used to measure:
a) Feature correlation
b) Classifier performance
c) Clustering efficiency
d) Regression error
Answer: b
4. What is the output of the following code?

x = [1, 2, 3]
y = x
y += [4, 5]
print(x)
a) [1, 2, 3]
b) [1, 2, 3, 4, 5]
c) Error
d) [4, 5]
Answer: b
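The answer depends on `y` and `x` naming the same list object, and on `+=` extending a list in place. The sketch below contrasts that with `y = y + [...]`, which builds a new list; the variables `a` and `b` are added here purely for comparison.

```python
x = [1, 2, 3]
y = x            # y and x refer to the same list object
y += [4, 5]      # in-place extend: the shared list is mutated
print(x)         # [1, 2, 3, 4, 5]
print(y is x)    # True

a = [1, 2, 3]
b = a
b = b + [4, 5]   # creates a new list and rebinds b only
print(a)         # [1, 2, 3]
print(b is a)    # False
```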
5. What is "feature engineering"?
a) Testing the model
b) Creating new input variables from existing data
c) Building a user interface
d) Deploying the model
Answer: b

6. Which visualization technique is most suitable for outlier detection?


a) Histogram
b) Box Plot
c) Heatmap
d) Scatter Matrix
Answer: b
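A box plot conventionally flags points that lie more than 1.5 times the interquartile range beyond the quartiles. The following is a rough sketch of that rule using the standard library; the function name `iqr_outliers`, the quartile method, and the sample data are assumptions chosen for illustration, and plotting libraries may compute quartiles slightly differently.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(data))  # [102]
```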

7. Cross-validation is used primarily to:


a) Reduce bias
b) Detect overfitting (estimate performance on unseen data)
c) Increase dataset size
d) Normalize data
Answer: b
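As a sketch of how cross-validation partitions data, the generator below splits sample indices into k contiguous folds, each used once as the held-out test set. The name `k_fold_indices` is hypothetical; real projects would more likely rely on a library splitter and shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(n_samples=6, k=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Every sample appears in exactly one test fold, so the averaged score reflects performance on data the model did not see during training.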

8. Which of the following is an unsupervised learning model?


a) Linear Regression
b) KNN
c) PCA
d) Logistic Regression
Answer: c

9. A/B testing is an example of:


a) Parametric Regression
b) Hypothesis Testing
c) Clustering
d) Feature Engineering
Answer: b
10. Kurtosis is used to measure:
a) Skewness of distribution
b) Shape (tail heaviness, often described as peakedness) of distribution
c) Central tendency
d) Feature correlation
Answer: b

11. In classification, the contingency table is also known as:


a) Covariance Matrix
b) Confusion Matrix
c) PCA Matrix
d) Regression Table
Answer: b
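For a binary classifier, the confusion matrix is simply a count of each (actual, predicted) pair. A minimal sketch with made-up labels, assuming 1 is the positive class:

```python
from collections import Counter

# Illustrative labels only; 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # actual positive, predicted positive
fn = counts[(1, 0)]  # actual positive, predicted negative
fp = counts[(0, 1)]  # actual negative, predicted positive
tn = counts[(0, 0)]  # actual negative, predicted negative

print([[tn, fp], [fn, tp]])           # [[3, 1], [1, 3]]
accuracy = (tp + tn) / len(y_true)
print(accuracy)                       # 0.75
```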

12. Which of the following distributions has memoryless property?


a) Normal
b) Exponential
c) Poisson
d) Binomial
Answer: b
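The memoryless property says that P(X > s + t | X > s) = P(X > t). The check below verifies this numerically for an exponential distribution via its survival function exp(-rate * t); the rate and the values of s and t are arbitrary choices for illustration.

```python
import math


def survival(t, rate):
    """P(X > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)


rate, s, t = 0.5, 2.0, 3.0
conditional = survival(s + t, rate) / survival(s, rate)  # P(X > s+t | X > s)
unconditional = survival(t, rate)                        # P(X > t)

print(math.isclose(conditional, unconditional))  # True
```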

13. The distribution of the (suitably scaled) sample variance of a normally distributed population follows:
a) Normal Distribution
b) Chi-Square Distribution
c) t Distribution
d) F Distribution
Answer: b
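More precisely, for a sample of size n from a normal population with variance sigma^2, the scaled quantity (n - 1) * s^2 / sigma^2 follows a chi-square distribution with n - 1 degrees of freedom. The simulation below is a rough sanity check under assumed values of n and sigma: a chi-square variable with k degrees of freedom has mean k and variance 2k.

```python
import random
from statistics import fmean, variance

random.seed(0)
n, sigma = 5, 2.0

scaled = []
for _ in range(20000):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    # variance() is the sample variance (divides by n - 1)
    scaled.append((n - 1) * variance(sample) / sigma ** 2)

print(round(fmean(scaled), 2))     # should be close to n - 1 = 4
print(round(variance(scaled), 2))  # should be close to 2 * (n - 1) = 8
```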

14. In hypothesis testing, Type II error occurs when:


a) Rejecting H₀ when H₀ is true
b) Failing to reject (accepting) H₀ when H₀ is false
c) Rejecting H₁ when H₁ is false
d) Rejecting both H₀ and H₁
Answer: b
15. The distribution of the ratio of two independent chi-square variables, each divided by
its degrees of freedom, is:
a) t Distribution
b) F Distribution
c) Normal Distribution
d) Exponential Distribution
Answer: b

16. Which test is suitable for testing independence of two categorical variables?
a) Z test
b) F test
c) Chi-Square test
d) t test
Answer: c
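The chi-square test of independence compares observed counts in a contingency table with the counts expected if the two variables were independent. Here is a minimal sketch of the statistic; the function name `chi_square_statistic` and the 2x2 table are illustrative assumptions.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


observed = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(observed)
print(round(stat, 3), dof)  # 6.667 1
```

With 1 degree of freedom the 5% critical value is about 3.841, so this made-up table would lead to rejecting independence at that level.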

17. In NoSQL, a document-oriented database stores data in:


a) Tables and columns
b) Rows and joins
c) JSON-like key-value pairs
d) Graphs
Answer: c

18. ACID property 'Durability' ensures:


a) Transaction changes are visible to others immediately
b) Changes are permanent even after system crash
c) All concurrent transactions are serialized
d) Intermediate results are stored
Answer: b

19. In MongoDB, data is stored in:


a) Tables
b) Collections of JSON-like documents
c) Flat files
d) Tuples and relations
Answer: b
20. What is the default "_id" field in MongoDB used for?
a) Indexing only text fields
b) Unique primary key for each document
c) Foreign key linking
d) Reference to external file
Answer: b

21. What does db.collection.find({}).pretty() do?


a) Finds documents and outputs raw JSON
b) Inserts pretty-printed JSON
c) Displays documents in readable format
d) Lists only schema
Answer: c

22. To retrieve documents where field "age" > 25, which query is correct?
a) db.users.find({"age": ">25"})
b) db.users.find({age: {$gt: 25}})
c) db.users.find({age: $greater(25)})
d) db.users.get({age > 25})
Answer: b
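The filter in option (b) is a nested document: the field name maps to an operator document. The sketch below mimics that shape in plain Python so it runs without a database. The `matches` helper is hypothetical and handles only `$gt`; the user records are invented. A driver such as PyMongo accepts the same kind of dictionary as the filter argument to `find`.

```python
users = [
    {"name": "Asha", "age": 31},
    {"name": "Ben", "age": 25},
    {"name": "Chen", "age": 42},
]

# Same shape as option (b); in Python the operator key must be a quoted string.
query = {"age": {"$gt": 25}}


def matches(document, query):
    """Hypothetical helper: supports only {field: {"$gt": value}} filters."""
    for field, condition in query.items():
        threshold = condition["$gt"]
        if field not in document or not document[field] > threshold:
            return False
    return True


selected = [u["name"] for u in users if matches(u, query)]
print(selected)  # ['Asha', 'Chen']
```

Note that 25 itself is excluded, because `$gt` means strictly greater than.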

23. Cassandra follows which type of database architecture?


a) Master-slave
b) Peer-to-peer
c) Hierarchical
d) Client-server
Answer: b

24. In Cassandra, a tombstone is:


a) A pointer to last write timestamp
b) A deleted row marker
c) A snapshot
d) A replica heartbeat
Answer: b
25. What will be the output of the expression typeof(5L) in R?
a) "double"
b) "numeric"
c) "integer"
d) "character"
Answer: c

26. Which of the following is a major task in data mining?


a) Data Normalization
b) Data Indexing
c) Pattern Discovery
d) Query Processing
Answer: c

27. In k-means clustering, which of the following is a limitation?


a) Handles non-numeric data well
b) Finds global optimum
c) Sensitive to initial centroids
d) Suitable for categorical data
Answer: c
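To illustrate the sensitivity to initial centroids, here is a small one-dimensional sketch of Lloyd-style k-means. The function `kmeans_1d`, the data, and the two starting points are assumptions made for the example; the same data converges to different clusterings depending on where the centroids start.

```python
from statistics import fmean


def kmeans_1d(points, centroids, iterations=20):
    """Plain k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [fmean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


points = [1, 2, 3, 10, 11, 12, 20, 21, 22]
result_a = kmeans_1d(points, centroids=[1, 2])
result_b = kmeans_1d(points, centroids=[12, 22])
print(result_a)  # [2.0, 16.0]
print(result_b)  # [6.5, 21.0]
```

Both results are stable under further iterations, yet they group the middle values differently, which is why implementations commonly run several random restarts.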

28. What is the key difference between classification and clustering?


a) Classification uses labeled data; clustering doesn’t
b) Clustering uses supervised learning
c) Classification identifies structure
d) Clustering uses dependent variables
Answer: a

29. The major drawback of Apriori algorithm is:


a) Cannot find association rules
b) Requires numeric data only
c) Expensive candidate generation
d) Only works for continuous data
Answer: c
30. Entropy in a dataset refers to:
a) Number of classes
b) Impurity (lack of purity) of the data
c) Error rate
d) Margin of classifier
Answer: b
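Entropy is zero when every record belongs to one class and grows as the classes become more evenly mixed. A minimal sketch of Shannon entropy in bits follows; the function name and label lists are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    return sum(p * log2(1 / p) for p in probabilities)


pure = ["yes"] * 4
mixed = ["yes", "yes", "no", "no"]

print(entropy(pure))   # 0.0: a single class, perfectly pure
print(entropy(mixed))  # 1.0: two equally likely classes
```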

31. What is the output of the expression list(range(2, 10, 2))?


a) [2, 3, 4, 5, 6, 7, 8, 9]
b) [2, 4, 6, 8]
c) [2, 6, 10]
d) [2, 5, 8]
Answer: b
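The arguments to `range` are start, stop, and step, and the stop value is never included. A few variations, added here for comparison:

```python
evens = list(range(2, 10, 2))
print(evens)                   # [2, 4, 6, 8]  (10 is excluded)

inclusive = list(range(2, 11, 2))
print(inclusive)               # [2, 4, 6, 8, 10]

descending = list(range(10, 2, -2))
print(descending)              # [10, 8, 6, 4]  (2 is excluded)
```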

32. Which of the following Python libraries is best suited for handling labeled data and
time series?
a) NumPy
b) Pandas
c) Matplotlib
d) Scikit-learn
Answer: b

33. Which NumPy function returns the standard deviation of an array?


a) stddev()
b) sd()
c) std()
d) var()
Answer: c
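NumPy's `std()` divides by N by default (`ddof=0`), which makes it a population standard deviation; passing `ddof=1` gives the sample version. The sketch below shows the same two quantities with the standard library so it runs without NumPy; the data values are arbitrary.

```python
import math
from statistics import pstdev, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(values)  # divides by N, like NumPy's default ddof=0
sample_sd = stdev(values)       # divides by N - 1, like ddof=1

print(population_sd)            # 2.0
print(round(sample_sd, 3))      # 2.138
```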

34. Which of the following libraries is used for machine learning in Python?
a) NumPy
b) Matplotlib
c) Seaborn
d) Scikit-learn
Answer: d
35. Which method is used to detect null values in pandas?
a) isna()
b) notnull()
c) isnan()
d) isnull()
Answer: d (isna() in option a is an equivalent alias, so a is also correct)

36. What is the purpose of ACID properties in databases?


a) Speed
b) Backup
c) Transaction reliability
d) Schema design
Answer: c
