[Interactive figure: "Normal Curve". Density (vertical axis) against x (horizontal axis) for the standard Normal \(N(0, 1)\), drawn in blue, and for \(N(\mu, \sigma^2)\), drawn in red, for the chosen values of \(\mu\) and \(\sigma\).]
Properties:
Bell-shaped and Unimodal;
Fully specified by two parameters, \(\mu\) and \(\sigma\):
\(\mu\) determines the location;
\(\sigma\) determines the spread;
Symmetric about the mean \(\mu\);
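These properties can be checked numerically with R's dnorm function, which evaluates the Normal density. The following is a minimal sketch; the values of mu and sigma are arbitrary choices for illustration and do not come from the slides.

```r
# Illustrative parameter values (assumptions, chosen arbitrarily)
mu <- 3
sigma <- 2

# Symmetry: the density is equal at the same distance either side of mu
left  <- dnorm(mu - 1.5, mean = mu, sd = sigma)
right <- dnorm(mu + 1.5, mean = mu, sd = sigma)
all.equal(left, right)

# Location: on a fine grid, the density is highest at (approximately) mu
grid <- seq(mu - 10, mu + 10, by = 0.01)
peak <- grid[which.max(dnorm(grid, mean = mu, sd = sigma))]
peak

# Spread: doubling sigma lowers the peak, so the curve is flatter and wider
taller <- dnorm(mu, mean = mu, sd = sigma) > dnorm(mu, mean = mu, sd = 2 * sigma)
taller
```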
Areas under the Normal Model
The area under the Normal model tells us the probability that the corresponding variable is in a specified region.
We need a computer to obtain the area under the Normal model, because the integral of the Normal density has no closed-form expression.
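As a rough illustration of what the computer does, the sketch below approximates an area under the standard Normal density by numerical integration and compares it with the value from R's pnorm function. The interval from -1 to 2 is an arbitrary choice made for this example.

```r
# Area under the N(0, 1) density between -1 and 2, by numerical integration
area_numeric <- integrate(dnorm, lower = -1, upper = 2)$value

# The same area from the Normal cumulative distribution function
area_pnorm <- pnorm(2) - pnorm(-1)

# The two values should agree to many decimal places
c(numeric = area_numeric, pnorm = area_pnorm)
```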
But, there’s a rule that can help us do a quick check of our calculations.
The 68-95-99.7% Rule
Whatever the values of \(\mu\) and \(\sigma\), the following rule holds:
| Interval | % of data within the interval |
|---|---|
| within \(1\sigma\) of \(\mu\) | about \(68\%\) |
| within \(2\sigma\) of \(\mu\) | about \(95\%\) |
| within \(3\sigma\) of \(\mu\) | about \(99.7\%\) |
This is a useful approximation for a sanity check!
For actual solutions use R (or a table if you don’t have access to R).
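The percentages in the 68-95-99.7% rule can be reproduced with pnorm. This is a minimal sketch: it works with the standard Normal, since the probability of falling within \(k\sigma\) of \(\mu\) does not depend on \(\mu\) or \(\sigma\).

```r
# Probability of falling within k standard deviations of the mean.
# The result is the same for every mu and sigma, so N(0, 1) is used here.
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)

# Express as percentages, rounded to two decimals
round(100 * within_k, 2)
```

The values come out at roughly 68.27%, 95.45% and 99.73%, which the rule rounds to 68%, 95% and 99.7%.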
R’s pnorm and qnorm functions
Probability:
To obtain the area under the curve, we use the pnorm function.
For example, suppose we have a \(N( \mu = 10, \sigma^2 = 3)\) and want the area below 11.5:
We can use the following code
pnorm(11.5, mean = 10, sd = sqrt(3))
[1] 0.8067619
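The same function also gives areas above a value or between two values. A short sketch for the same \(N(\mu = 10, \sigma^2 = 3)\) model follows; the lower endpoint 9 is an arbitrary value chosen for illustration.

```r
mu <- 10
sigma <- sqrt(3)  # pnorm expects the standard deviation, not the variance

# Area above 11.5: the complement of the area below 11.5
above <- 1 - pnorm(11.5, mean = mu, sd = sigma)

# Equivalent call using pnorm's lower.tail argument
above_alt <- pnorm(11.5, mean = mu, sd = sigma, lower.tail = FALSE)

# Area between 9 and 11.5: difference of two areas below
between <- pnorm(11.5, mean = mu, sd = sigma) - pnorm(9, mean = mu, sd = sigma)

c(above = above, above_alt = above_alt, between = between)
```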
Quantile:
To obtain the quantile of a Normal, we use the qnorm function.
For example, suppose we have a \(N( \mu = 10, \sigma^2 = 3)\) and want the 0.69-quantile:
We can use the following code
qnorm(0.69, mean = 10, sd = sqrt(3))
[1] 10.85884
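The functions qnorm and pnorm are inverses of each other: the area below the 0.69-quantile is 0.69. A quick sketch of that check, using the same model:

```r
mu <- 10
sigma <- sqrt(3)

# The 0.69-quantile of N(10, 3)
q <- qnorm(0.69, mean = mu, sd = sigma)

# The area below that quantile recovers the probability 0.69
p <- pnorm(q, mean = mu, sd = sigma)
p
```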
Standard Normal
The \(Z\)-score of a variable coming from \(N(\mu, \sigma^2)\) follows the Standard Normal distribution, i.e., \(N(0, 1)\).
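The \(Z\)-score of a value \(x\) is \(z = (x - \mu)/\sigma\). As a small sketch of why this matters, the area below \(x\) under \(N(\mu, \sigma^2)\) equals the area below \(z\) under \(N(0, 1)\). The numbers reuse the earlier \(N(\mu = 10, \sigma^2 = 3)\) example.

```r
mu <- 10
sigma <- sqrt(3)
x <- 11.5

# Z-score: distance from the mean in units of standard deviations
z <- (x - mu) / sigma

# Area below x under N(10, 3) ...
p_direct <- pnorm(x, mean = mu, sd = sigma)

# ... equals the area below z under the standard Normal
p_standard <- pnorm(z)

c(direct = p_direct, standard = p_standard)
```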
There are multiple ways to check for adequacy of the Normal model. A simple (and subjective) way is to check if the relative frequency histogram looks like a Normal curve.
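A visual check of this kind might look like the sketch below. The data are simulated and purely hypothetical; with a real sample, replace sample_values with the observed measurements.

```r
set.seed(1)  # makes the simulated data reproducible

# Simulated data standing in for a real sample (hypothetical values)
sample_values <- rnorm(500, mean = 50, sd = 5)

# Histogram on the density scale, so the bar areas sum to 1
h <- hist(sample_values, freq = FALSE,
          main = "Histogram with Normal curve", xlab = "Value")

# Overlay a Normal curve with the sample mean and standard deviation
m <- mean(sample_values)
s <- sd(sample_values)
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "red")
```

If the bars follow the red curve closely, the Normal model is plausible; marked skewness or heavy tails suggest otherwise.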
Example 1: Housefly Wing Lengths
Sokal and Hunter (1955) studied the wing lengths of houseflies.
Example 2: Birthweight
In this case, we have a heavier left tail, which might compromise the Normal approximation.
Exercise 1
Scores on a standard IQ test for the 20 to 34 age group follow approximately the Normal model with mean \(\mu=110\) and standard deviation \(\sigma=25\).
What percentage of people aged 20 to 34 have IQ scores below 160?
What percentage have scores between 90 and 120?
How high must an IQ score be so that only 0.15% of the group score above it?
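One possible way to set up the three calculations in R is sketched below, assuming the Normal model with \(\mu = 110\) and \(\sigma = 25\) stated in the exercise. The variable names are illustrative only.

```r
mu <- 110
sigma <- 25

# First question: percentage with IQ scores below 160
below_160 <- 100 * pnorm(160, mean = mu, sd = sigma)

# Second question: percentage with IQ scores between 90 and 120
between_90_120 <- 100 * (pnorm(120, mean = mu, sd = sigma) -
                         pnorm(90, mean = mu, sd = sigma))

# Third question: score with only 0.15% of the group above it,
# which is the 0.9985-quantile
cutoff <- qnorm(1 - 0.0015, mean = mu, sd = sigma)

c(below_160 = below_160, between_90_120 = between_90_120, cutoff = cutoff)
```

The 68-95-99.7% rule gives a quick check: 160 is \(\mu + 2\sigma\), so about 97.5% should fall below it, and 0.15% above corresponds to about \(\mu + 3\sigma = 185\).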
Exercise 2
A machine used to regulate the amount of dye dispensed for mixing shades of paint can be set so that it discharges an average of \(\mu\) milliliters of dye per can of paint. The amount of dye discharged is known to follow the Normal model with a standard deviation of 0.4 milliliters. If more than 6 milliliters of dye are discharged when making a certain shade of blue paint, the shade is unacceptable. Determine the setting for the mean \(\mu\) such that only 2% of the cans of paint will be unacceptable.
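A sketch of one way to approach this in R: requiring \(P(X > 6) = 0.02\) means that 6 must be the 0.98-quantile of the model, so \(6 = \mu + z_{0.98}\,\sigma\), where \(z_{0.98}\) is the 0.98-quantile of the standard Normal. The variable names below are illustrative only.

```r
sigma <- 0.4            # standard deviation in milliliters
limit <- 6              # more than this amount is unacceptable
p_unacceptable <- 0.02  # target proportion of unacceptable cans

# 0.98-quantile of the standard Normal
z <- qnorm(1 - p_unacceptable)

# Solve limit = mu + z * sigma for mu
mu <- limit - z * sigma
mu

# Check: proportion of cans above the limit under this setting
check <- pnorm(limit, mean = mu, sd = sigma, lower.tail = FALSE)
check
```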
References
Image Attributions
Fly Image Attribution: See page for author, CC BY 4.0, via Wikimedia Commons.
Sokal, Robert R., and Preston E. Hunter. 1955. “A Morphometric Analysis of DDT-Resistant and Non-Resistant House Fly Strains.” Annals of the Entomological Society of America 48 (6): 499–507. https://doi.org/10.1093/aesa/48.6.499.