For example, there is a correlation between income and education. We find that people with higher income tend to have more years of education. (Equivalently, people with more years of education tend to have higher income.) When we know there is a correlation between two variables, we can make a prediction. If we know a group's income, we can predict, approximately, their years of education.
In statistics, dependence is any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence.
Familiar examples of dependent phenomena include the correlation between the physical statures of parents and their offspring, and the correlation between the demand for a product and its price. Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling; however, statistical dependence is not sufficient to demonstrate the presence of such a causal relationship (i.e., correlation does not imply causation).
In probability theory and statistics, the mathematical concepts of covariance and correlation are very similar. Both describe the degree to which two random variables or sets of random variables tend to deviate from their expected values in similar ways.
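If X and Y are two random variables with means $\mu_X$ and $\mu_Y$, their covariance and correlation are

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)], \qquad \rho_{XY} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y},$$

where E is the expected value operator and $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y, respectively. Notably, correlation is dimensionless while covariance is in units obtained by multiplying the units of the two variables. The covariance of a variable with itself (i.e. $\sigma_{XX}$) is called the variance and is more commonly denoted as $\sigma_X^2$, the square of the standard deviation. The correlation of a variable with itself is always 1 (except in the degenerate case where the two variances are zero, in which case the correlation does not exist because computing it would require division by zero).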
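The relationship between covariance and correlation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it treats a finite list of values as the whole population (so the expected value is a plain average), and the function names and the height and weight figures are hypothetical, chosen only to show the effect of changing units.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean, standing in for the expected value E."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: the mean product of deviations from the means."""
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sigma_x = sqrt(covariance(xs, xs))  # variance is the covariance of X with itself
    sigma_y = sqrt(covariance(ys, ys))
    if sigma_x == 0 or sigma_y == 0:
        # Degenerate case: a constant variable has no defined correlation.
        raise ValueError("correlation is undefined when a variance is zero")
    return covariance(xs, ys) / (sigma_x * sigma_y)


# Hypothetical measurements, used only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [50.0, 58.0, 69.0, 77.0]
heights_m = [h / 100 for h in heights_cm]  # same heights in different units

print(covariance(heights_cm, weights_kg))   # units: cm * kg
print(covariance(heights_m, weights_kg))    # units: m * kg, 100 times smaller
print(correlation(heights_cm, weights_kg))  # dimensionless
print(correlation(heights_m, weights_kg))   # same value despite the unit change
```

Changing the height unit rescales the covariance but leaves the correlation unchanged, which is what "dimensionless" means in practice.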
In probability theory, to say that two events are independent (alternatively called statistically independent or stochastically independent) means that the occurrence of one does not affect the probability of the other. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.
The concept of independence extends to dealing with collections of more than two events or random variables.
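For two events, independence is usually checked with the product rule: A and B are independent exactly when P(A and B) = P(A) P(B). The sketch below applies that rule to two fair dice by enumerating all 36 outcomes. The events and helper names are illustrative assumptions, and exact fractions are used so that the equality test is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))


def probability(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))


def are_independent(event_a, event_b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    p_both = probability(lambda o: event_a(o) and event_b(o))
    return p_both == probability(event_a) * probability(event_b)


def first_is_even(outcome):
    return outcome[0] % 2 == 0


def sum_is_seven(outcome):
    return sum(outcome) == 7


def sum_is_eight(outcome):
    return sum(outcome) == 8


print(are_independent(first_is_even, sum_is_seven))  # True
print(are_independent(first_is_even, sum_is_eight))  # False
```

Knowing that the first die is even does not change the chance of a total of seven (1/6 either way), but it does raise the chance of a total of eight from 5/36 to 1/6, so those two events are dependent.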
In cryptography, correlation attacks are a class of known-plaintext attacks for breaking stream ciphers whose keystream is generated by combining the output of several linear-feedback shift registers (LFSRs) using a Boolean function. Correlation attacks exploit a statistical weakness that arises from a poor choice of the Boolean function. It is possible to select a function that avoids correlation attacks, so this type of cipher is not inherently insecure; it is simply essential to consider susceptibility to correlation attacks when designing stream ciphers of this type.
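The statistical weakness can be made concrete by counting how often a combining function's output equals each of its inputs. The sketch below uses the combining function of the Geffe generator, a standard textbook example that is not named in the text above and is assumed here purely for illustration, and contrasts it with a three-input XOR. The function names are hypothetical.

```python
from itertools import product


def geffe(x1, x2, x3):
    """Geffe combiner: x1 selects between x2 (when x1 = 1) and x3 (when x1 = 0)."""
    return (x1 & x2) ^ ((1 - x1) & x3)


def xor3(x1, x2, x3):
    """Three-input XOR, used here only as a contrast."""
    return x1 ^ x2 ^ x3


def agreement_counts(combiner):
    """For each input, count the truth-table rows (out of 8) where it equals the output."""
    counts = [0, 0, 0]
    for inputs in product((0, 1), repeat=3):
        z = combiner(*inputs)
        for i, bit in enumerate(inputs):
            if bit == z:
                counts[i] += 1
    return counts


print(agreement_counts(geffe))  # [4, 6, 6]
print(agreement_counts(xor3))   # [4, 4, 4]
```

With the Geffe combiner, the output matches the second and third inputs in 6 of 8 cases (75%), so an attacker can test candidate initial states for one register at a time against the keystream instead of searching all registers jointly. The XOR combiner shows no such single-input bias (4 of 8 for every input), but it is shown only as a contrast: it is linear and would be weak for other reasons.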