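Overview of Standard Deviation in R

Standard deviation is a statistic that measures the amount of dispersion or variation in a set of values. It is generally used when we need to know how far the values lie from their mean.

Mathematical formula of standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Where, $s$ is the standard deviation, $x_i$ is the $i$-th observation, $\bar{x}$ is the mean of all the observations, and $n$ is the number of observations. This is the sample standard deviation, which divides by $n - 1$; it is the version used in the steps below and by R's sd() function.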
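How to Calculate Standard Deviation?

Steps to calculate the standard deviation are:

Step 1: Calculate the mean of all the observations.
Step 2: For each observation, subtract the mean and square the result.
Step 3: Add up all the squared values.
Step 4: Divide the sum by the number of observations minus 1, and take the square root of the result.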
You will get the standard deviation as the result after completing the 4 steps.

Examples with Steps of Standard Deviation

Let's take an example and follow these steps.

Example #1

The data set looks like this:

4, 8, 9, 4, 7, 5, 2, 3, 6, 8, 1, 8, 2, 6, 9, 4, 7, 4, 8, 2

Step 1: Calculate the mean of all the observations.

Mean = 107 / 20 = 5.35
Step 2: For each observation, subtract the mean, then square the result to get (Observation - Mean)^2. With a mean of 5.35, the first observation gives (4 - 5.35)^2 = 1.8225.

Step 3: Add up all the squared values.

1.8225 + 7.0225 + 13.3225 + 1.8225 + 2.7225 + 0.1225 + 11.2225 + 5.5225 + 0.4225 + 7.0225 + 18.9225 + 7.0225 + 11.2225 + 0.4225 + 13.3225 + 1.8225 + 2.7225 + 1.8225 + 7.0225 + 11.2225 = 126.55

Step 4: Calculate the standard deviation by dividing the sum by the number of observations minus 1 and taking the square root of the result.

Standard Deviation = (126.55 / 19)^0.5 = 2.5808 (rounded)

Example #2

Now we will look at another example with a different dataset. In this example, we have two columns. The first column contains alphabetic codes that we assigned to a group of people, and the second column contains the ages of those people.

Step 1: Load the file into R. We use the read.csv function because the file, although it was created in Excel, is saved in CSV format. The file is named "alphabetic code.csv". The call looks like this:

    SD_age = read.csv("alphabetic code.csv")

Step 2: Calculate the standard deviation from the loaded data. Only the second column contains numeric values, so we run sd on that column specifically. Passing the whole data frame to sd does not work, because sd expects a numeric vector.

    standard_deviation_age = sd(SD_age[[2]])

The output gives us the standard deviation of the ages, which is 15.52926.
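The two steps of Example #2 can be tried end to end with the sketch below. The original CSV file is not reproduced in this article, so the code first writes a small stand-in file to a temporary folder. The column names Code and Age and the five rows are invented for illustration, which means the result will not match the 15.52926 reported above.

```r
# Hypothetical stand-in for "alphabetic code.csv": the column names
# (Code, Age) and the values are invented for this sketch.
people <- data.frame(
  Code = c("A", "B", "C", "D", "E"),
  Age  = c(20, 30, 40, 50, 60)
)
csv_path <- file.path(tempdir(), "alphabetic code.csv")
write.csv(people, csv_path, row.names = FALSE)

# Step 1: load the CSV file into a data frame.
SD_age <- read.csv(csv_path)

# Step 2: sd() needs a numeric vector, so pick the second column.
standard_deviation_age <- sd(SD_age[[2]])
print(standard_deviation_age)
```

Once the real column name is known, selecting the column by name (for example SD_age$Age in this stand-in data) would work equally well.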
Methods of Standard Deviation in R

There are multiple methods to calculate the standard deviation in R. Here we will discuss one long method and one very short method.

1. Long Method

This method follows the same steps we used earlier in this article. The only difference is that we now use R commands.

Step 1: Calculate the mean of all the observations.

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    mean(dataset)

Output: 5.35

Step 2: For each observation, subtract the mean. In R, subtracting a single value from a vector is applied to every element, so one expression gives [Observation - Mean] for the whole dataset.

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    dataset - mean(dataset)

Output: The output shows Observation - Mean for all the values in our dataset, starting with -1.35, 2.65, 3.65.

Now we will square each value of this output and then do the summation.

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    (dataset - mean(dataset))^2

Output: The squared values, starting with 1.8225, 7.0225, 13.3225.

Step 3: Add up all the squared values, that is, all the [(Observation - Mean)^2].

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    sum((dataset - mean(dataset))^2)

Output: 126.55

Step 4: Calculate the standard deviation. Now we put all the information derived in the steps above into this formula:

Standard Deviation = (Sum of (Observation - Mean)^2 / (Number of observations - 1))^0.5
In R, this calculation looks like this:

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    sqrt(sum((dataset - mean(dataset))^2) / (length(dataset) - 1))

Output: 2.5808 (rounded)

Hence we can see that the standard deviation is the same as the one we got earlier.

2. Short Method

The syntax in R for the direct method looks like this:

    sd(x, na.rm = FALSE)

Where sd is the standard deviation function and x is the set of values for which we need to find the standard deviation. The na.rm argument controls missing values. If it is TRUE, all missing values are removed from x before the calculation. If it is FALSE, which is the default, missing values are not removed, and the result is NA whenever x contains a missing value.

Code:

    dataset = c(4,8,9,4,7,5,2,3,6,8,1,8,2,6,9,4,7,4,8,2)
    sd(dataset)

Output: 2.5808 (rounded)

Conclusion

Standard deviation tells us how far the observations in a dataset are spread out from the mean. The significance of low and high standard deviation is:

A low standard deviation means that the observations lie close to the mean.
A high standard deviation means that the observations are spread out over a wider range of values.
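As a small illustration of low versus high standard deviation, the sketch below compares two vectors that share the same mean but differ in spread, and also shows the effect of na.rm on a vector with a missing value. The vectors tight, wide and with_missing are invented for this example and do not come from the datasets above.

```r
# Two invented vectors with the same mean (50) but different spread.
tight <- c(48, 49, 50, 51, 52)
wide  <- c(10, 30, 50, 70, 90)

mean(tight)  # 50
mean(wide)   # 50

sd_tight <- sd(tight)  # low: values sit close to the mean
sd_wide  <- sd(wide)   # high: values lie far from the mean
print(c(tight = sd_tight, wide = sd_wide))

# With a missing value, sd() returns NA unless na.rm = TRUE.
with_missing <- c(tight, NA)
sd(with_missing)                # NA
sd(with_missing, na.rm = TRUE)  # same value as sd(tight)
```

Both vectors have a mean of 50, so the mean alone cannot tell them apart; the standard deviation is what captures the difference in spread.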