(1) What is the independent variable? What is the dependent variable?
The independent variable is the condition under which the word is printed: whether the ink color matches the word (congruent) or does not match it (incongruent). The dependent variable is the time the participant takes to name the ink colors.
(2) What is an appropriate set of hypotheses for this task? Specify your null and alternative hypotheses, and clearly define any notation used. Justify your choices.
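Null Hypothesis ( $H_{0}:\mu_{0}=\mu_{1}$): the population mean response time is the same under the congruent condition ($\mu_{0}$) and the incongruent condition ($\mu_{1}$).
Alternative Hypothesis ( $H_{1}:\mu_{0}\ne\mu_{1}$): the two population mean response times differ.
Statistical Test : a two-tailed paired (dependent-samples) t-test will be used, because each participant is measured under both conditions and the population standard deviation is unknown.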
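For reference, the paired t statistic is computed from the per-participant differences. The symbols $d_{i}$, $\bar{d}$, $s_{d}$ and $n$ are introduced here for illustration only: $d_{i}$ is the difference between the two recorded times for participant $i$, and $n$ is the number of participants.

$$\bar{d}=\frac{1}{n}\sum_{i=1}^{n}d_{i},\qquad s_{d}=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_{i}-\bar{d}\right)^{2}},\qquad t=\frac{\bar{d}}{s_{d}/\sqrt{n}}$$

Under $H_{0}$ the mean difference is zero, and $t$ follows a t-distribution with $n-1$ degrees of freedom.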
(3) Report some descriptive statistics regarding this dataset. Include at least one measure of central tendency and at least one measure of variability. The name of the data file is 'stroopdata.csv'.
# Perform the analysis here
import pandas as pd
df = pd.read_csv('stroopdata.csv')
df.describe()
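As a measure of central tendency, the means are: Congruent mean = 14.051125, Incongruent mean = 22.015917. The values 8.630000 and 15.687000 are the minimum times for the Congruent and Incongruent conditions, respectively.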
As a measure of variability, the standard deviations are: Congruent stdev = 3.559358, Incongruent stdev = 4.797057.
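These statistics can also be computed directly instead of being read off the `describe()` table, which makes it harder to confuse one row with another. The following is a minimal sketch: the helper name `summarize` is hypothetical, and the small sample frame contains made-up values that only stand in for `stroopdata.csv`.

```python
import pandas as pd

def summarize(df):
    """Return mean, median, sample standard deviation and IQR for each column."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    return pd.DataFrame({
        "mean": df.mean(),
        "median": df.median(),
        "std": df.std(),   # sample standard deviation (ddof=1), same as describe()
        "iqr": q3 - q1,    # interquartile range
    })

# Hypothetical illustrative values, NOT taken from stroopdata.csv
sample = pd.DataFrame({
    "Congruent": [10.0, 12.0, 14.0, 16.0],
    "Incongruent": [18.0, 20.0, 22.0, 24.0],
})
summary = summarize(sample)
print(summary)
```

With the real data, the same helper would be called as `summarize(pd.read_csv('stroopdata.csv'))`.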
(4) Provide one or two visualizations that show the distribution of the sample data. Write one or two sentences noting what you observe about the plot or plots.
# Build the visualizations here
import matplotlib.pyplot as plt
%matplotlib inline
df.plot(kind='box')
My observation from the box plot is that the median time for the incongruent condition is higher than the median time for the congruent condition.
df.plot(kind='hist', alpha=0.5)
In the histogram, the incongruent distribution peaks at a longer time than the congruent distribution.
(5) Now, perform the statistical test and report your results. What is your confidence level or Type I error associated with your test? What is your conclusion regarding the hypotheses you set up? Did the results match up with your expectations? Hint: Think about what is being measured on each individual, and what statistic best captures how an individual reacts in each environment.
from scipy import stats
stats.ttest_rel(df['Congruent'], df['Incongruent'])
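The test uses a significance level of $\alpha = 0.05$ (a Type I error rate of 5%, i.e. a 95% confidence level). The p-value is lower than 0.05, so the null hypothesis is rejected: the mean response time differs between the congruent and incongruent conditions.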
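To make the test above less of a black box, the paired t-test can be reproduced by hand from the per-participant differences and checked against `stats.ttest_rel`. This is a sketch under stated assumptions: the function name `paired_t_test` and the sample values are hypothetical and are not taken from `stroopdata.csv`.

```python
import numpy as np
from scipy import stats

def paired_t_test(first, second, alpha=0.05):
    """Two-tailed paired t-test computed from the per-participant differences."""
    d = np.asarray(first, dtype=float) - np.asarray(second, dtype=float)
    n = d.size
    mean_d = d.mean()
    se = d.std(ddof=1) / np.sqrt(n)              # standard error of the mean difference
    t_stat = mean_d / se
    dof = n - 1
    p_value = 2 * stats.t.sf(abs(t_stat), dof)   # two-tailed p-value
    t_crit = stats.t.ppf(1 - alpha / 2, dof)     # critical value for the confidence interval
    ci = (mean_d - t_crit * se, mean_d + t_crit * se)
    return t_stat, p_value, ci

# Hypothetical illustrative values, NOT taken from stroopdata.csv
congruent = [10.0, 12.0, 14.0, 16.0, 11.0]
incongruent = [18.0, 19.0, 23.0, 22.0, 20.0]

t_stat, p_value, ci = paired_t_test(congruent, incongruent)
ref = stats.ttest_rel(congruent, incongruent)    # library result for comparison

print(t_stat, p_value, ci)
print(ref.statistic, ref.pvalue)
```

On the real data the call would be `paired_t_test(df['Congruent'], df['Incongruent'])`. A confidence interval for the mean difference that excludes zero leads to the same decision as a p-value below `alpha`.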
(6) Optional: What do you think is responsible for the effects observed? Can you think of an alternative or similar task that would result in a similar effect? Some research about the problem will be helpful for thinking about these two questions!
I'll do it after I buy some time for it :)