
# This is a quiz assignment. I will send you a picture of the quiz paper at 4 pm on 14 Jun. You will answer 16 questions immediately, within 4 hours. You may finish in as little as one hour. Thank you.

Answered Same Day Jun 13, 2021

## Solution

Ishvina answered on Jun 15 2021
Solution 1.
The 2 types of data fallacies are:
1. Data dredging - It is repeatedly testing new hypotheses against the same set of data.
2. Overfitting - It is creating a model that is overly tailored to the data we train it with and is not representative of the general trend.
· They can be identified during the data preparation process:
1. Data dredging - It shows up as an exhaustive search over combinations of variables that might show a correlation, and over groups of cases or observations that show differences in their mean or in their breakdown by some other variable.
2. Overfitting - It can be identified by checking validation metrics such as accuracy and loss.
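The validation check for overfitting can be sketched as follows. This is a minimal illustration, not part of the original answer: the 1-nearest-neighbour classifier, the synthetic data, and the 20% label noise are all assumptions chosen to make the gap visible. A model that memorizes its training rows scores perfectly on them but noticeably worse on held-out rows.

```python
import random

def make_data(n, rng, noise=0.2):
    """Synthetic rows (x, label): label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label  # label noise that a model should NOT learn
        data.append((x, label))
    return data

def predict_1nn(train, x):
    """Predict the label of the single closest training row (pure memorization)."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]

def accuracy(train, data):
    """Fraction of rows in `data` that the 1-NN model built on `train` labels correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in data)
    return correct / len(data)

rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(200, rng)

train_acc = accuracy(train, train)  # each row is its own nearest neighbour
valid_acc = accuracy(train, valid)  # unseen rows expose the memorized noise
print("train accuracy:", train_acc)
print("validation accuracy:", valid_acc)
```

A large gap between the two numbers is the warning sign; comparing training and validation loss works the same way.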
· They could impact a model if left in the dataset:
1. Data dredging - It dramatically increases the risk of false positives while understating that risk.
2. Overfitting - It negatively impacts the performance of the model on new data: the noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize.
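The false-positive risk from data dredging can be illustrated with a small simulation. This is a sketch under stated assumptions: the features and the target are pure random noise, `pearson` and `count_spurious` are hypothetical helper names, and the cutoff `1.96 / sqrt(n)` is a large-sample approximation of the two-sided 5% threshold for a correlation coefficient.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def count_spurious(n_features=500, n_rows=50, seed=0):
    """Test many unrelated noise features against one noise target.

    Returns how many features look 'significant' at roughly the 5% level,
    even though no real relationship exists.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    cutoff = 1.96 / math.sqrt(n_rows)  # approximate 5% two-sided threshold
    hits = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        if abs(pearson(feature, target)) > cutoff:
            hits += 1
    return hits

hits = count_spurious()
print(hits, "of 500 noise features passed the significance cutoff")
```

Each individual test has only about a 5% chance of a false positive, but running hundreds of them against the same data makes some "discoveries" almost certain.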
Solution 2.
· What does it mean to have tidy data and how does it relate to the analytics base table (ABT)?
Having tidy data means having data sets that are arranged such that each variable is a column and each observation is a row.
The ABT is a flat table that is used for building analytical models and scoring the future behavior of a subject. A single record in this table represents the subject of the prediction and stores all data (variables) describing this subject. An ABT is therefore tidy in this sense: one row per prediction subject and one column per descriptive variable.
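The two definitions above can be made concrete with a small sketch. The customer records, the column names, and the helpers `to_tidy` and `to_abt` are hypothetical, invented only for illustration: a wide table is first reshaped so each observation is a row, then aggregated into one record per prediction subject.

```python
# Hypothetical wide table: one column per year, so "year" is hidden in the headers.
wide = [
    {"customer": "A", "spend_2020": 100, "spend_2021": 120},
    {"customer": "B", "spend_2020": 80, "spend_2021": 60},
]

def to_tidy(rows):
    """Reshape so each variable is a column and each observation is a row."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key.startswith("spend_"):
                year = int(key.split("_")[1])
                tidy.append({"customer": row["customer"], "year": year, "spend": value})
    return tidy

def to_abt(tidy):
    """Build a flat table with one record per subject (here: per customer)."""
    abt = {}
    for obs in tidy:
        record = abt.setdefault(
            obs["customer"],
            {"customer": obs["customer"], "total_spend": 0, "n_years": 0},
        )
        record["total_spend"] += obs["spend"]
        record["n_years"] += 1
    return list(abt.values())

tidy_rows = to_tidy(wide)
abt_rows = to_abt(tidy_rows)
for record in abt_rows:
    print(record)
```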
· Why are the data types in the ABT so important for modeling?
-> The data type of a feature restricts which models can be used.
· How does each modeling method view the ABT/feature matrix?
· Simulation
· Error-based
· Probability-based
· Similarity-based
· Information-based
· Time Series
It varies according to the type of feature:
For time series - features are numeric and time-ordered
For information-based - features can be numerical or categorical
For similarity-based - features can be label or target labels
For probability-based - features can be categorical
For error-based - features are numeric
For simulation - features are numeric
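Because some of the methods above expect numeric features, a categorical column usually has to be encoded before such a method can use it. The sketch below shows one common option, one-hot encoding. The `one_hot` helper and the colour values are hypothetical, and the original answer does not say which encoding, if any, was intended.

```python
def one_hot(values):
    """Encode categorical values as 0/1 indicator columns.

    Returns the sorted category names and one indicator row per input value.
    """
    categories = sorted(set(values))
    encoded = [[1 if value == category else 0 for category in categories]
               for value in values]
    return categories, encoded

colors = ["red", "green", "red", "blue"]  # hypothetical categorical feature
categories, encoded = one_hot(colors)
print(categories)
for row in encoded:
    print(row)
```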
Solution 3.
· What is simulation modeling?
Simulation modeling is an algorithmic modeling method which propagates states of a system through time. It is suitable for describing systems that are difficult to capture or analyze with standard analytical mathematical or experimental methods.
Simulation modeling is of three types:
1. Deterministic for discrete systems
2. Stochastic
3. Dynamic
· Give an example of a simulation modeling problem.
An example of a simulation modeling problem is choosing between two stock portfolios (portfolio simulation).
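A portfolio comparison of this kind could be sketched as a Monte Carlo simulation. Everything specific here is an assumption made for illustration: the `simulate_portfolio` helper, the normally distributed yearly log-returns, and the `mu` and `sigma` values of the two made-up portfolios. The inner loop is the part that propagates the state (the portfolio value) through time.

```python
import math
import random
import statistics

def simulate_portfolio(mu, sigma, years=10, n_paths=2000, start=100.0, seed=0):
    """Return the final value of each simulated path.

    mu and sigma are the mean and standard deviation of the yearly log-return.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        value = start
        for _ in range(years):
            # Propagate the state one year forward with a random log-return.
            value *= math.exp(rng.gauss(mu, sigma))
        finals.append(value)
    return finals

# Hypothetical portfolios; the parameters are illustration values only.
cautious = simulate_portfolio(mu=0.03, sigma=0.05, seed=1)
aggressive = simulate_portfolio(mu=0.06, sigma=0.25, seed=2)

for name, finals in [("cautious", cautious), ("aggressive", aggressive)]:
    loss_rate = sum(v < 100.0 for v in finals) / len(finals)
    print(name,
          "mean:", round(statistics.mean(finals), 1),
          "stdev:", round(statistics.stdev(finals), 1),
          "share of paths ending below start:", loss_rate)
```

Comparing the mean, the spread, and the share of losing paths gives a basis for choosing between the two portfolios under the stated assumptions.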
· How does simulation modeling differ from machine learning?
Simulation modeling is easy to explain because it follows a bottom-up approach: it only requires setting up the rules and the starting states. With simulation, the random variable inputs aren't known exactly, but the model is often known exactly. With machine learning, the inputs are known exactly, but the model is unknown prior to training.
All of machine learning revolves around optimization.
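The contrast can be sketched in a few lines. In this illustration, which is an assumption and not taken from the original answer, the simulation knows the model `y = TRUE_SLOPE * x` and draws random inputs, while the learner sees only the resulting data and recovers the slope by gradient descent on the squared error.

```python
import random

TRUE_SLOPE = 2.0  # known to the simulation, unknown to the learner

def simulate(n, rng, noise=0.1):
    """Simulation: the model is known exactly, the inputs are random."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, TRUE_SLOPE * x + rng.gauss(0, noise)))
    return data

def fit_slope(data, steps=200, lr=0.5):
    """Machine learning: the data are known, the model is found by optimization.

    Minimizes the mean squared error of y = w * x with plain gradient descent.
    """
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w

rng = random.Random(0)
data = simulate(100, rng)
learned = fit_slope(data)
print("true slope:", TRUE_SLOPE, "learned slope:", round(learned, 2))
```

The learning rate of 0.5 is safe here only because the inputs lie in [0, 1); other data would need a different step size.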
Solution 4.
· Describe how error-based modeling methods work in general.
Error-based methods like regression are concerned with modeling the relationship between variables both...