27 Jan 2010

Sample Essay: Prediction of Business Pattern

Data mining is the process of analyzing data from different angles and perspectives (Data Mining: What is Data Mining?, reviewed on 16 March 2009, http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm). By analyzing the data we obtain useful information that can then be categorized and used to predict future patterns. Consumer-focused companies use data mining to determine the impact of internal and external factors on the sales and profits of their products. Data mining gives a company information that can be used to predict sales patterns from customers' past purchase behaviour, and it also helps plan future promotional activities. It can therefore be said that the major goal of data mining is the prediction of business patterns.

For example, a shopkeeper collects data on the purchase patterns of his customers: the age and gender of the buyer, the frequency of purchase, and the occasions of purchase. He can then use this knowledge to innovate in his promotional strategies and increase sales (a reflective practice, i.e. the outcome is changed by changing the response to the stimulus).

Data mining has been used for several decades by companies that possessed a tacit knowledge of the process; it was then known as knowledge management, and the benefit those companies derived from it was considerable. When they began using statistical methods, the accuracy of their predictions increased: relationships between certain inputs and certain outcomes could now be derived statistically. The advent of computers, followed by major software developments, helped companies turn to electronic data mining, which increased the speed and accuracy of predictions further. Today, custom-made software is available for analyzing the data collected by companies, and it also helps them warehouse that data.

Today, data mining has become an important knowledge management (KM) model in the corporate structure, and it plays a major role in the decision-making processes of company leadership. Most data mining processes use the basic statistical principle of Exploratory Data Analysis (EDA), and data mining also makes use, to some extent, of Artificial Intelligence (AI) and database research technologies in processing information.

The data mining process

In earlier days the whole process was done manually and the scope of pattern prediction was very limited. Now, with the help of computers, the raw data are retrieved from a centralised storage unit (the warehouse) and categorized using custom-made software. The categorized information is then summarized to obtain knowledge about a particular feature, which is in turn used to predict future patterns.

The process mentioned above may seem to categorize data mining as a knowledge management process, but there are more complex activities involved. The initial stage is preparing the data for further classification and analysis; it involves the steps mentioned below.

(a) Data preparation: The process of omitting unwanted or unrealistic data (such as out-of-range values, like a negative value for age). This is an important step, as the accuracy of the prediction hinges on the accuracy of the raw data.

(b) Feature selection: The process of identifying and selecting the prediction-related features from all available features.

(c) Data reduction: Large volumes of data are tabulated and aggregated to reduce their volume for further, easier manipulation.

(d) Deployment: The process of applying a suitable statistical model to the information for classification, codification and pattern prediction.

(Data Mining Techniques, reviewed on 16 March 2009, http://www.statsoft.nl/uk/textbook/stdatmin.html)
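The four preparation steps above can be sketched in a few lines of Python; the record fields and toy values below are invented purely for illustration and stand in for a real customer table.

```python
# A minimal sketch of the preparation steps, with invented field
# names and toy records (not from any real system).
records = [
    {"age": 34, "gender": "F", "visits": 12, "spend": 480.0},
    {"age": -5, "gender": "M", "visits": 3,  "spend": 90.0},   # out-of-range age
    {"age": 58, "gender": "M", "visits": 7,  "spend": 310.0},
    {"age": 41, "gender": "F", "visits": 12, "spend": 530.0},
]

# (a) Data preparation: drop unrealistic values such as negative ages.
clean = [r for r in records if 0 <= r["age"] <= 120]

# (b) Feature selection: keep only the features relevant to the prediction.
features = [{"visits": r["visits"], "spend": r["spend"]} for r in clean]

# (c) Data reduction: aggregate the rows down to summary figures.
total_spend = sum(f["spend"] for f in features)
avg_visits = sum(f["visits"] for f in features) / len(features)

print(len(clean), total_spend, round(avg_visits, 2))
```

In practice the cleaning rules, the selected features and the aggregates would all be dictated by the prediction task at hand.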

Types of relationships analyzed in data mining

As part of the process, data mining seeks to ascertain four types of relationships among the data being mined. The relationships are as listed below.

(a) Class: the classification of the data into different groups according to their features

(b) Cluster: a grouping of two or more features that have a certain relationship between them

(c) Association: a relationship such as the combined purchase of two different types of products

(d) Sequential pattern: whether the sale of one product is related to the subsequent sale of another product

(Data Mining: What is Data Mining?, reviewed on 16 March 2009, http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm)
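As a rough illustration of the "association" relationship, the following Python sketch counts how often two products are bought together across a set of invented purchase baskets; the support figure is the fraction of baskets containing both items.

```python
from itertools import combinations
from collections import Counter

# Toy purchase baskets (invented) illustrating the "association"
# relationship: how often two products are bought together.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support of a pair = fraction of baskets containing both items.
support = {pair: n / len(baskets) for pair, n in pair_counts.items()}
print(support[("bread", "butter")])
```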

General analyzing methods used in data mining

Various innovative methods are being used in data mining these days. Still, as mentioned earlier, EDA (Exploratory Data Analysis) is the major statistical principle employed in data mining. The following are some of the EDA methods used.

(a) Artificial neural networks: resemble the neural networks of the human body and are non-linear in nature (comparable to the mental models used in psychology for behaviour analysis)

(b) Genetic algorithms: based on the method of natural selection and natural evolution

(Doug Alexander, Data Mining, reviewed on 16 March 2009, http://www.eco.utexas.edu/~norman/BUS.FOR/course.mat/Alex/).

(c) Decision tree models, such as CART (Classification and Regression Trees) and CHAID (Chi-Square Automatic Interaction Detection). CART produces two-way (binary) splits, while CHAID can produce multi-way splits. These are comparatively newer methods.

(d) Nearest neighbour method: classifies each record of a warehoused data sheet with reference to the combination of classes found in a historical data sheet. For similar classes of records, a similar prediction (given in the historical data sheet) can be made.

(e) If-then rule induction model: this method uses the principles of cause and effect for making pattern predictions.

(f) Data visualization: one of the most effective methods, in which the data are interpreted using graphic tools with the help of computer software. It has great potential for the future of data mining.

(Data Mining: What is Data Mining?, reviewed on 16 March 2009, http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm)
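Of these, the nearest neighbour method is simple enough to sketch directly. The Python fragment below classifies a new record by the class of the closest record in a historical data sheet; the records, features (age, visits) and class labels are all made up for illustration.

```python
# A sketch of the nearest-neighbour idea: classify a new record by the
# label of the closest record in the historical sheet. Data invented.
historical = [
    ((25, 2),  "low spender"),
    ((40, 10), "high spender"),
    ((55, 9),  "high spender"),
    ((30, 3),  "low spender"),
]

def nearest_label(record):
    # Squared Euclidean distance over the (age, visits) features.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(historical, key=lambda h: dist(h[0], record))[1]

print(nearest_label((38, 9)))
```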

Recent developments

The facts mentioned above are general in nature. In recent decades there have been rapid changes in all aspects of data mining, such as (a) the types of data being mined (e.g. relational, transactional, multimedia and ontology data), (b) the knowledge derived by mining (e.g. association, class, trend and discrimination relationships, and multiple and integrated functions), (c) the techniques developed (e.g. OLAP analysis, scalable data mining, spatio-temporal data mining), and (d) the applications of data mining (e.g. the retail market, banking, credit cards and fraud detection) (Longbing Cao & Chengqi Zhang 2008, Domain Driven Data Mining, in Data Mining and Knowledge Discovery Technologies, D. Taniar (ed.), IGI Global).

Utility of data mining

Data mining has tremendous potential in the business field today, as every company depends heavily on its knowledge management system to stay in the competition. Companies like Wal-Mart have achieved mammoth success partly because they could provide their suppliers with up-to-date data from a large, networked data warehouse, which the suppliers analyze with their own custom-made software. In this way the suppliers can find out the purchasing trend at any given time at any given outlet and plan promotional activities accordingly (Data Mining: What is Data Mining?, reviewed on 16 March 2009, http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm). This type of real-time data mining is also known as On-Line Analytical Processing (OLAP). OLAP has created a win-win situation for the company as well as for its suppliers. As an offshoot, even some NGOs are making use of data mining these days to compile and analyze data on social capital, complex adaptive systems of society, communities of practice, etc., and to use such information to blow the whistle on the HR malpractices of big companies.

Intra disciplinary utility

Data mining also has some intra-disciplinary utilities, which in turn help statisticians develop more accurate methods for data mining. Some of the intra-disciplinary models are:

(a) The GridMiner Assistant mining programme, which uses grid computing in an automated, ontology-based framework. It is used in scientific discovery, the optimal treatment of patients, cost cutting, etc. (Peter Brezany, Ivan Janciak & A Min Tjoa (2008) Ontology-Based Construction of Grid Data Mining Workflows. IGI Global).

(b) Multiple criteria optimization methods (e.g. multiple criteria linear, quadratic and fuzzy-linear programming), which are used for statistical analysis in the pharmaceutical industry (e.g. classification of HIV-1 associated dementia), the finance industry (e.g. scoring management) and banks (e.g. credit card fraud detection) (Yong Shi, Yi Peng, Gang Kou & Zhengxin Chen (2008) Introduction to Data Mining Techniques via Multiple Criteria Optimization Approaches and Applications. IGI Global).

Infrastructure requirements

Modern data mining techniques exploit the computing capabilities of the company. The required processing capacity is determined by the number of queries, the complexity of the analysis and the volume of data to be synthesized: the greater the number of queries, the complexity of the analysis, or the volume of data, the more processing capacity is required. The security level of the data is a deciding factor in choosing the type of hardware. Companies also use client-server networking to warehouse their data as part of capacity development.

Related issues

Though the business community the world over has benefitted immensely from data mining, the process has raised some doubts about culture and corporate ethics. Some individuals oppose the use of explicit knowledge of their purchase behaviour, and the codification or classification of such knowledge, for a company's profit making. Still, the majority of the population does not consider it harmful, and companies seem to be concerned only with cost and profit.


Data mining is the process of deriving meaning and patterns from collected data in order to predict a future pattern. Thus we can safely say that, in a broader sense, data mining is a form of knowledge management. A similar process is followed in both: raw data are converted into information by analyzing the relationships among the data; this information is used to derive knowledge by analyzing the patterns between various pieces of information; and finally this knowledge creates wisdom (knowledge management) or predicts future patterns (data mining).


Longbing Cao & Chengqi Zhang (2008) Domain Driven Data Mining. IGI Global.

Peter Brezany, Ivan Janciak & A Min Tjoa (2008) Ontology-Based Construction of Grid Data Mining Workflows. IGI Global.

Yong Shi, Yi Peng, Gang Kou & Zhengxin Chen (2008) Introduction to Data Mining Techniques via Multiple Criteria Optimization Approaches and Applications. IGI Global.

UCLA Anderson School of Management website: http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm

The University of Texas at Austin website: http://www.eco.utexas.edu/~norman/BUS.FOR/course.mat/Alex

Statsoft.nl website: http://www.statsoft.nl/uk/textbook/stdatmin.html

25 Aug 2009

Sample Essay: Activity 2. A Correlational Coefficient

In probability theory and statistics, correlation (often measured as a correlation coefficient) indicates the strength and direction of a linear relationship between two variables. In correlational research we do not (or at least try not to) influence any variables; we only measure them and look for relations (correlations) between some set of variables, such as blood pressure and cholesterol level.

In the scientific method, an experiment is a set of observations performed in the context of solving a particular problem or question, in order to support or falsify a hypothesis concerning phenomena. The experiment is a cornerstone of the empirical approach to acquiring deeper knowledge about the physical world. Experimental research is also called randomized controlled research, or randomized controlled trials. In experimental research we manipulate some variables and then measure the effects of this manipulation on other variables; for example, a researcher might artificially increase blood pressure and then record cholesterol level. Data analysis in experimental research also comes down to calculating "correlations" between variables, specifically between those manipulated and those affected by the manipulation. However, experimental data may potentially provide qualitatively better information: only experimental data can conclusively demonstrate causal relations between variables. For example, if we find that whenever we change variable A, variable B also changes, then we can conclude that "A influences B." Data from correlational research can only be "interpreted" in causal terms on the basis of some theory that we hold; correlational data cannot conclusively prove causality (source: http://www.statsoft.com/textbook/stathome.html).

A correlation coefficient is used to represent the relationship between two variables and is often abbreviated with the letter 'r'. It typically ranges between -1.0 and +1.0 and provides two important pieces of information about the relationship: its intensity and its direction.

r = -1: perfect negative relation
example of a negative r: drinking in college and GPA

r = 0: no relation
example of a near-zero r: hair length and GPA

r = +1: perfect positive relation
example of a positive r: GPA and scores on SAT

Experimental research is not done through behavioural surveys but through controlled physical manipulation, whereas correlational research is used for behavioural questions, where not everything can be measured in exact figures.

Suitability of correlation method:

Correlation is suitable when researchers want a way to measure how associated or related two variables are. The researcher looks at things that already exist and determines if, and in what way, those things are related to each other. Things may be related positively or negatively. The purpose of computing correlations is to allow us to make a prediction about one variable based on what we know about another variable. Correlation is not causation: it does not establish the cause and effect of the relation between two variables. This type of research is used in education, employment, etc.

For example, there is a correlation between income and education. We find that people with higher income have more years of education. (You can also phrase it that people with more years of education have higher income.) When we know there is a correlation between two variables, we can make a prediction. If we know a group's income, we can predict their years of education.

Suitability of experimental method:

The experiment is appropriate as the method to use to demonstrate a cause and effect relationship between two variables. When researchers want to know about the cause of a behaviour or mental process, they should do an experiment. In an experiment, the researcher manipulates or changes the environment in a controlled way, then measures the effect of that manipulation.

For example, it is through experiments that we know that drinking alcohol causes slower reaction times. The experimenter can give a set amount of alcohol to a group of participants, and then measure their reaction times. If their time slows down after drinking the alcohol, we know the alcohol caused that effect.

Accuracy, Validity and significance of correlation method:

As mentioned before, the correlation coefficient (r) represents the linear relationship between two variables. If the correlation coefficient is squared, then the resulting value (r², the coefficient of determination) will represent the proportion of common variation in the two variables (i.e., the "strength" or "magnitude" of the relationship). In order to evaluate the correlation between variables, it is important to know this "magnitude" or "strength" as well as the significance of the correlation.

Significance of Correlations. The significance level calculated for each correlation is a primary source of information about the reliability of the correlation. As explained before (see Elementary Concepts), the significance of a correlation coefficient of a particular magnitude will change depending on the size of the sample from which it was computed. The test of significance is based on the assumption that the distribution of the residual values (i.e., the deviations from the regression line) for the dependent variable y follows the normal distribution, and that the variability of the residual values is the same for all values of the independent variable x. However, Monte Carlo studies suggest that meeting those assumptions closely is not absolutely crucial if your sample size is not very small and the departure from normality is not very large. It is impossible to formulate precise recommendations based on those Monte Carlo results, but many researchers follow a rule of thumb that if your sample size is 50 or more then serious biases are unlikely, and if your sample size is over 100 then you should not be concerned at all about the normality assumptions. There are, however, much more common and serious threats to the validity of the information that a correlation coefficient can provide; they are briefly discussed in the following paragraphs.

Outliers are atypical (by definition), infrequent observations. Because of the way in which the regression line is determined (especially the fact that it is based on minimizing not the sum of simple distances but the sum of squared distances of data points from the line), outliers have a profound influence on the slope of the regression line and consequently on the value of the correlation coefficient. A single outlier is capable of considerably changing the slope of the regression line and, consequently, the value of the correlation, as demonstrated in the following example. Note that, as shown in that illustration, just one outlier can be entirely responsible for a high value of the correlation that otherwise (without the outlier) would be close to zero. Needless to say, one should never base important conclusions on the value of the correlation coefficient alone (i.e., examining the respective scatterplot is always recommended).

Note that if the sample size is relatively small, then including or excluding specific data points that are not as clearly “outliers” as the one shown in the previous example may have a profound influence on the regression line (and the correlation coefficient). This is illustrated in the following example where we call the points being excluded “outliers;” one may argue, however, that they are not outliers but rather extreme values.

Typically, we believe that outliers represent a random error that we would like to be able to control. Unfortunately, there is no widely accepted method to remove outliers automatically (however, see the next paragraph), thus what we are left with is to identify any outliers by examining a scatterplot of each important correlation. Needless to say, outliers may not only artificially increase the value of a correlation coefficient, but they can also decrease the value of a “legitimate” correlation.

See also Confidence Ellipse.

Quantitative Approach to Outliers. Some researchers use quantitative methods to exclude outliers. For example, they exclude observations that are outside the range of ±2 standard deviations (or even ±1.5 sd’s) around the group or design cell mean. In some areas of research, such “cleaning” of the data is absolutely necessary. For example, in cognitive psychology research on reaction times, even if almost all scores in an experiment are in the range of 300-700 milliseconds, just a few “distracted reactions” of 10-15 seconds will completely change the overall picture. Unfortunately, defining an outlier is subjective (as it should be), and the decisions concerning how to identify them must be made on an individual basis (taking into account specific experimental paradigms and/or “accepted practice” and general research experience in the respective area). It should also be noted that in some rare cases, the relative frequency of outliers across a number of groups or cells of a design can be subjected to analysis and provide interpretable results. For example, outliers could be indicative of the occurrence of a phenomenon that is qualitatively different than the typical pattern observed or expected in the sample, thus the relative frequency of outliers could provide evidence of a relative frequency of departure from the process or phenomenon that is typical for the majority of cases in a group. See also Confidence Ellipse.
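A minimal sketch of that quantitative rule in Python, using invented reaction times: the one "distracted reaction" lying more than 2 standard deviations from the mean is dropped.

```python
import statistics

# Drop observations more than 2 standard deviations from the mean.
# Reaction times in milliseconds, invented for illustration.
times = [420, 380, 510, 460, 395, 12000]  # one "distracted" reaction

mean = statistics.mean(times)
sd = statistics.stdev(times)
kept = [t for t in times if abs(t - mean) <= 2 * sd]

print(kept)
```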

Correlations in Non-homogeneous Groups. A lack of homogeneity in the sample from which a correlation was calculated can be another factor that biases the value of the correlation. Imagine a case where a correlation coefficient is calculated from data points which came from two different experimental groups but this fact is ignored when the correlation is calculated. Let us assume that the experimental manipulation in one of the groups increased the values of both correlated variables and thus the data from each group form a distinctive “cloud” in the scatterplot (as shown in the graph below).

In such cases, a high correlation may result that is entirely due to the arrangement of the two groups, but which does not represent the “true” relation between the two variables, which may practically be equal to 0 (as could be seen if we looked at each group separately, see the following graph).

If you suspect the influence of such a phenomenon on your correlations and know how to identify such “subsets” of data, try to run the correlations separately in each subset of observations. If you do not know how to identify the hypothetical subsets, try to examine the data with some exploratory multivariate techniques (e.g., Cluster Analysis).

Nonlinear Relations between Variables. Another potential source of problems with the linear (Pearson r) correlation is the shape of the relation. As mentioned before, Pearson r measures a relation between two variables only to the extent to which it is linear; deviations from linearity will increase the total sum of squared distances from the regression line even if they represent a “true” and very close relationship between two variables. The possibility of such non-linear relationships is another reason why examining scatterplots is a necessary step in evaluating every correlation. For example, the following graph demonstrates an extremely strong correlation between the two variables which is not well described by the linear function.

Measuring Nonlinear Relations. What do you do if a correlation is strong but clearly nonlinear (as concluded from examining scatterplots)? Unfortunately, there is no simple answer to this question, because there is no easy-to-use equivalent of Pearson r that is capable of handling nonlinear relations. If the curve is monotonous (continuously decreasing or increasing) you could try to transform one or both of the variables to remove the curvilinearity and then recalculate the correlation. For example, a typical transformation used in such cases is the logarithmic function which will “squeeze” together the values at one end of the range. Another option available if the relation is monotonous is to try a nonparametric correlation (e.g., Spearman R, see Nonparametrics and Distribution Fitting) which is sensitive only to the ordinal arrangement of values, thus, by definition, it ignores monotonous curvilinearity. However, nonparametric correlations are generally less sensitive and sometimes this method will not produce any gains. Unfortunately, the two most precise methods are not easy to use and require a good deal of “experimentation” with the data. Therefore you could:

Try to identify the specific function that best describes the curve. After a function has been found, you can test its “goodness-of-fit” to your data.

Alternatively, you could experiment with dividing one of the variables into a number of segments (e.g., 4 or 5) of an equal width, treat this new variable as a grouping variable and run an analysis of variance on the data.

Exploratory Examination of Correlation Matrices. A common first step of many data analyses that involve more than a very few variables is to run a correlation matrix of all variables and then examine it for expected (and unexpected) significant relations. When this is done, you need to be aware of the general nature of statistical significance (see Elementary Concepts); specifically, if you run many tests (in this case, many correlations), then significant results will be found “surprisingly often” due to pure chance. For example, by definition, a coefficient significant at the .05 level will occur by chance once in every 20 coefficients. There is no “automatic” way to weed out the “true” correlations. Thus, you should treat all results that were not predicted or planned with particular caution and look for their consistency with other results; ultimately, though, the most conclusive (although costly) control for such a randomness factor is to replicate the study. This issue is general and it pertains to all analyses that involve “multiple comparisons and statistical significance.” This problem is also briefly discussed in the context of post-hoc comparisons of means and the Breakdowns option.

Casewise vs. Pairwise Deletion of Missing Data. The default way of deleting missing data while calculating a correlation matrix is to exclude all cases that have missing data in at least one of the selected variables; that is, by casewise deletion of missing data. Only this way will you get a “true” correlation matrix, where all correlations are obtained from the same set of observations. However, if missing data are randomly distributed across cases, you could easily end up with no “valid” cases in the data set, because each of them will have at least one missing data in some variable. The most common solution used in such instances is to use so-called pairwise deletion of missing data in correlation matrices, where a correlation between each pair of variables is calculated from all cases that have valid data on those two variables. In many instances there is nothing wrong with that method, especially when the total percentage of missing data is low, say 10%, and they are relatively randomly distributed between cases and variables. However, it may sometimes lead to serious problems.

For example, a systematic bias may result from a “hidden” systematic distribution of missing data, causing different correlation coefficients in the same correlation matrix to be based on different subsets of subjects. In addition to the possibly biased conclusions that you could derive from such “pairwise calculated” correlation matrices, real problems may occur when you subject such matrices to another analysis (e.g., multiple regression, factor analysis, or cluster analysis) that expects a “true correlation matrix,” with a certain level of consistency and “transitivity” between different coefficients. Thus, if you are using the pairwise method of deleting the missing data, be sure to examine the distribution of missing data across the cells of the matrix for possible systematic “patterns.”

How to Identify Biases Caused by Pairwise Deletion of Missing Data. If the pairwise deletion of missing data does not introduce any systematic bias to the correlation matrix, then all those pairwise descriptive statistics for one variable should be very similar. However, if they differ, then there are good reasons to suspect a bias. For example, if the mean (or standard deviation) of the values of variable A that were taken into account in calculating its correlation with variable B is much lower than the mean (or standard deviation) of those values of variable A that were used in calculating its correlation with variable C, then we would have good reason to suspect that those two correlations (A-B and A-C) are based on different subsets of data, and thus, that there is a bias in the correlation matrix caused by a non-random distribution of missing data.

Pairwise Deletion of Missing Data vs. Mean Substitution. Another common method to avoid losing data due to casewise deletion is the so-called mean substitution of missing data (replacing all missing data in a variable by the mean of that variable). Mean substitution offers some advantages and some disadvantages as compared to pairwise deletion. Its main advantage is that it produces “internally consistent” sets of results (“true” correlation matrices). The main disadvantages are:

Mean substitution artificially decreases the variation of scores, and this decrease in individual variables is proportional to the number of missing data (i.e., the more missing data, the more “perfectly average scores” will be artificially added to the data set).

Because it substitutes missing data with artificially created “average” data points, mean substitution may considerably change the values of correlations.
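The first disadvantage is easy to demonstrate. In the Python sketch below (invented numbers), replacing five missing observations with the variable's mean leaves the mean unchanged but visibly shrinks the standard deviation.

```python
import statistics

# Filling missing values with the mean shrinks the standard deviation.
observed = [10, 12, 14, 16, 18]            # mean is 14
n_missing = 5
filled = observed + [statistics.mean(observed)] * n_missing

sd_observed = statistics.stdev(observed)
sd_filled = statistics.stdev(filled)

print(round(sd_observed, 3), round(sd_filled, 3))
```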

Spurious Correlations. Although you cannot prove causal relations based on correlation coefficients (see Elementary Concepts), you can still identify so-called spurious correlations; that is, correlations that are due mostly to the influences of “other” variables. For example, there is a correlation between the total amount of losses in a fire and the number of firemen that were putting out the fire; however, what this correlation does not indicate is that if you call fewer firemen then you would lower the losses. There is a third variable (the initial size of the fire) that influences both the amount of losses and the number of firemen. If you “control” for this variable (e.g., consider only fires of a fixed size), then the correlation will either disappear or perhaps even change its sign. The main problem with spurious correlations is that we typically do not know what the “hidden” agent is. However, in cases when we know where to look, we can use partial correlations that control for (partial out) the influence of specified variables.

Are correlation coefficients “additive?” No, they are not. For example, an average of correlation coefficients in a number of samples does not represent an “average correlation” in all those samples. Because the value of the correlation coefficient is not a linear function of the magnitude of the relation between the variables, correlation coefficients cannot simply be averaged. In cases when you need to average correlations, they first have to be converted into additive measures. For example, before averaging, you can square them to obtain coefficients of determination which are additive (as explained before in this section), or convert them into so-called Fisher z values, which are also additive.
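A sketch of the Fisher z route in Python: the Fisher z transform of r is atanh(r), i.e. half the log of (1+r)/(1-r); the z values are averaged and the average is transformed back with tanh. The two r values are invented, and the result differs noticeably from the naive average.

```python
import math

# Averaging correlations via the Fisher z transform.
def fisher_z(r):
    return 0.5 * math.log((1 + r) / (1 - r))

def inverse_fisher_z(z):
    return math.tanh(z)          # tanh is the exact inverse of fisher_z

rs = [0.3, 0.8]
naive_average = sum(rs) / len(rs)
z_average = inverse_fisher_z(sum(fisher_z(r) for r in rs) / len(rs))

print(round(naive_average, 3), round(z_average, 3))
```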

How to Determine Whether Two Correlation Coefficients are Significantly Different. A test is available that will evaluate the significance of differences between two correlation coefficients in two samples. The outcome of this test depends not only on the size of the raw difference between the two coefficients but also on the size of the samples and on the size of the coefficients themselves. Consistent with the previously discussed principle, the larger the sample size, the smaller the effect that can be proven significant in that sample. In general, due to the fact that the reliability of the correlation coefficient increases with its absolute value, relatively small differences between large correlation coefficients can be significant. For example, a difference of .10 between two correlations may not be significant if the two coefficients are .15 and .25, although in the same sample, the same difference of .10 can be highly significant if the two coefficients are .80 and .90.
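A common form of such a test (for two independent samples) compares the Fisher z transforms of the coefficients. The sketch below, with invented sample sizes of 100, reproduces the point made above: the same raw difference of .10 yields a much larger test statistic when the coefficients themselves are large.

```python
import math

# Fisher z test for the difference between two independent
# correlation coefficients; sample sizes are invented.
def z_difference(r1, n1, r2, n2):
    z1 = math.atanh(r1)                      # Fisher transform of r1
    z2 = math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se

# The same raw difference of .10 at two different magnitudes:
z_small = z_difference(0.25, 100, 0.15, 100)
z_large = z_difference(0.90, 100, 0.80, 100)

print(round(z_small, 3), round(z_large, 3))
```

Comparing each statistic with the usual 1.96 cutoff, only the difference between the large coefficients reaches significance.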

Positive correlation

In a positive correlation, as the values of one of the variables increase, the values of the second variable also increase. Likewise, as the value of one of the variables decreases, the value of the other variable also decreases. The example above of income and education is a positive correlation. People with higher incomes also tend to have more years of education. People with fewer years of education tend to have lower income.


An advantage of the correlation method is that we can make predictions about things when we know about correlations. If two variables are correlated, we can predict one based on the other. For example, we know that SAT scores and college achievement are positively correlated. So when college admission officials want to predict who is likely to succeed at their schools, they will choose students with high SAT scores.

We know that years of education and years of jail time are negatively correlated. Prison officials can predict that people who have spent more years in jail will need remedial education, not college classes.


The problem that most students have with the correlation method is remembering that correlation does not measure cause. Take a minute and chant to yourself: Correlation is not Causation! Correlation is not Causation! I always have my in-class students chant this, yet some still forget this very crucial principle.

We know that education and income are positively correlated. We do not know if one caused the other. It might be that having more education causes a person to earn a higher income. It might be that having a higher income allows a person to go to school more. It might also be some third variable.

A correlation tells us that the two variables are related, but we cannot say anything about whether one caused the other. This method does not allow us to come to any conclusions about cause and effect.
