What can be useful in measuring a power-law-type distribution

Because this is ongoing research, I cannot reveal the exact use case for these statistical methods, but to put it simply – we have an important dimension in the dataset which can be characterized as a power-law distribution (a short peak at the beginning, on the left, and a long tail on the right). Because I am still learning statistics, and these notes can be useful more than once, I want to write down what I have learned.

Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, the investigator seeks to ascertain the causal effect of one variable upon another – the effect of a price increase upon demand, for example, or the effect of changes in the money supply upon the inflation rate. To explore such issues, the investigator assembles data on the underlying variables of interest and employs regression to estimate the quantitative effect of the causal variables upon the variable that they influence. The investigator also typically assesses the “statistical significance” of the estimated relationships, that is, the degree of confidence that the true relationship is close to the estimated relationship.

Double logarithmic transformation – a log(log(x)) transformation, that is, the logarithm applied twice. To find out when it is appropriate to use a logarithmic transformation instead of the actual values, you can read here: http://stats.stackexchange.com/questions/298/in-linear-regression-when-is-it-appropriate-to-use-the-log-of-an-independent-va
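
To see why a logarithmic transformation helps with a long right tail, here is a minimal sketch in Python. It is only an illustration, not the analysis from my research: the Pareto sample, the shape parameter 2.0, the seed and the helper name `skewness` are all assumptions, and it applies a single log rather than log(log(x)).

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical heavy-tailed data: Pareto samples (all values >= 1).
rng = random.Random(0)
raw = [rng.paretovariate(2.0) for _ in range(5000)]

# The log transform compresses the long right tail.
logged = [math.log(v) for v in raw]

print("skewness of raw values:   ", skewness(raw))
print("skewness of log(values):  ", skewness(logged))
```

The transformed values are still skewed to the right, but much less so than the raw values, which is usually what makes them easier to handle in a regression.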

Normal distribution – the normal distribution is immensely useful because of the central limit theorem, which states that, under mild conditions, the mean of many random variables independently drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution: physical quantities that are expected to be the sum of many independent processes (such as measurement errors) often have a distribution very close to the normal. Moreover, many results and methods (such as propagation of uncertainty and least squares parameter fitting) can be derived analytically in explicit form when the relevant variables are normally distributed.
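
The central limit theorem can be checked with a small simulation. The sketch below is an assumption-laden illustration: it uses an exponential distribution (clearly not normal), a sample size of 50 and 2000 repetitions, all chosen by me for the example. If the sample means are approximately normal, roughly 68% of them should fall within one standard error of the true mean.

```python
import math
import random

rng = random.Random(42)

def sample_mean(n):
    """Mean of n independent draws from an exponential distribution (rate 1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

n = 50        # draws per sample (assumed for illustration)
reps = 2000   # number of sample means (assumed for illustration)
means = [sample_mean(n) for _ in range(reps)]

# For rate 1 the true mean is 1 and the standard deviation is 1,
# so the standard error of the mean is 1 / sqrt(n).
true_mean = 1.0
std_error = 1.0 / math.sqrt(n)

within_one_se = sum(abs(m - true_mean) <= std_error for m in means) / reps
print("fraction of means within one standard error:", within_one_se)
```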

Linear regression fit
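
A minimal sketch of a least-squares linear fit, written from scratch in Python so that nothing is hidden. The data are hypothetical and noise-free: y = 3 * x^(-2.5), an exponent and constant I picked only for the example. After taking the logarithm of both x and y, a power law becomes a straight line, and the slope of the fitted line is the exponent.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noise-free power law: y = 3 * x ** -2.5
xs = [float(i) for i in range(1, 51)]
ys = [3.0 * x ** -2.5 for x in xs]

# Fit a straight line to log(y) against log(x).
log_xs = [math.log(x) for x in xs]
log_ys = [math.log(y) for y in ys]
slope, intercept = linear_fit(log_xs, log_ys)

print("estimated exponent:", slope)
print("estimated constant:", math.exp(intercept))
```

On real, noisy data the estimate will of course not be exact, so this should be read as a demonstration of the idea rather than a recommended way to estimate the exponent.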

Nonlinear LOESS regression fit – loess stands for locally estimated scatter-plot smoothing (lowess stands for locally weighted scatter-plot smoothing) and is one of many non-parametric regression techniques, but arguably the most flexible.
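
To make the idea of "locally weighted" concrete, here is a simplified sketch of LOESS in plain Python. It is my own minimal version, not the implementation of any statistics package: it assumes distinct x values, uses tricube weights and a local straight-line fit, and the function names and the `frac` parameter (the share of points used in each local fit) are chosen for this example.

```python
import math
import random

def tricube(u):
    """Tricube weight: close to 1 near the target point, 0 at distance >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess(xs, ys, frac=0.5):
    """Smooth ys at each x using a weighted straight-line fit on nearby points."""
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        w = [tricube((x - x0) / h) for x in xs]
        sw = sum(w)
        mean_x = sum(wi * x for wi, x in zip(w, xs)) / sw
        mean_y = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
        # Fall back to the weighted mean if the local fit is degenerate.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted

# Hypothetical noisy curve, smoothed with a 30% window.
rng = random.Random(1)
xs = [i / 10 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]
smooth = loess(xs, ys, frac=0.3)
print(smooth[:5])
```

A smaller `frac` follows the data more closely, while a larger one gives a smoother curve.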


