|
ageing
society |
|
| |
A society is called an "ageing society" when
people over 65 years old make up more than seven
percent of its population (United Nations). |
|
bar chart |
| |
A bar chart represents
the absolute or relative frequencies for
different categories of a qualitative
variable by means of bars. The
lengths
of the bars are proportional
to the values that they represent.
Sometimes
the bars are divided into at least two parts
in order to take more than one qualitative
variable into account. This leads to
stacked bar charts. Clustered bar charts are
an alternative for visualizing multivariate
data on qualitative variables. They are
obtained by juxtaposing the bars for
categories belonging to different variables.
A stacked or a clustered bar
chart can for example be used for showing
the preference of voters for different
political parties by sex. The stacked bar
chart divides the bars for each party into
two parts whereas the clustered bar charts
juxtapose the two components. |
|
birth rate |
| |
The birth rate is the number of live births during a
certain year as a proportion of the total
population. |
|
boxplot |
| |
A boxplot is a graphical tool that
summarizes the information contained in a
data set for continuous
→ quantitative variables by
displaying five characteristics of the set.
The five characteristics are the extreme
values (minimum and maximum), the lower and
upper
→ quartiles x0.25 and x0.75
and the
→ median x0.5. The graph
comprises a box that is connected with the
extreme values by means of lines (so-called
whiskers). The length of the box is defined
by the difference x0.75 – x0.25
between the upper quartile
and the lower quartile of the data set. The
median x0.5 is marked inside the
box. The total length of a boxplot (box and
whiskers) represents the
→ range of the data set. |
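The five characteristics behind a boxplot can be computed directly. The following is a minimal sketch, not a definitive implementation: the function name `five_number_summary` and the sample data are made up for illustration, and the quartiles use the "inclusive" convention of Python's `statistics.quantiles`, which is only one of several common quartile definitions.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    values = sorted(data)
    # n=4 yields the three quartile cut points x0.25, x0.5 and x0.75.
    # Assumption: the "inclusive" method; other conventions can give
    # slightly different quartiles for the same data.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return values[0], q1, q2, q3, values[-1]

sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # illustrative data only
minimum, q1, med, q3, maximum = five_number_summary(sample)
box_length = q3 - q1              # length of the box (interquartile range)
total_length = maximum - minimum  # box plus whiskers: the range
```

For this sample the box extends from 4 to 10 and the whiskers reach 2 and 15.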
|
census
of population |
| |
Census
denotes the procedure of systematically collecting statistical
information on all people of a country. Because of the large amount of
personnel and financial resources needed to carry out a census,
National Statistical Offices conduct one only rarely. The United
Nations recommends repeating a census every 10 years. |
| clustered
bar charts |
| |
Sometimes the bars of a bar chart are
divided into at least two parts in order to
take more than one qualitative variable into
account. Juxtaposing the partial bars for
categories belonging to different variables
leads to clustered bar charts. An
alternative for visualizing multivariate
data on qualitative variables is the →
stacked bar chart, obtained by laying
one partial bar on top of the other. |
|
cluster sampling |
| |
uses a “natural” partition of a population into subgroups
as a starting point. The natural subgroups are called clusters. In step
1, a random sample of clusters is drawn. Step 2 comprises the collection
of data for all elements of the selected clusters. |
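The two steps can be sketched in a few lines. This is an illustrative sketch only: the function name `cluster_sample`, the `seed` parameter and the school data are hypothetical and not part of the glossary.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """clusters maps a cluster name to the list of its elements."""
    rng = random.Random(seed)
    # Step 1: draw a random sample of k clusters.
    chosen = rng.sample(sorted(clusters), k)
    # Step 2: collect data for ALL elements of the selected clusters.
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: pupils grouped by school (the natural clusters).
schools = {
    "school_a": ["a1", "a2", "a3"],
    "school_b": ["b1", "b2"],
    "school_c": ["c1", "c2", "c3", "c4"],
}
selected = cluster_sample(schools, k=2, seed=1)
```

Every element of a chosen cluster enters the sample, which distinguishes this design from stratified sampling.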
|
composite indicators |
| |
Composite indicators
represent a linear combination of various single indicators. The main
problem in using these “synthetic” indicators is to find non-subjective
weights for the single indicators. Often, equal weights are employed
because no weighting scheme derived from data is available. |
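A linear combination with equal weights can be sketched as follows. The function name `composite_indicator` and the example values are hypothetical, and the sketch assumes the single indicators have already been normalised to a common scale.

```python
def composite_indicator(indicators, weights=None):
    """Weighted sum of single indicators; equal weights by default."""
    values = list(indicators)
    if weights is None:
        # Equal weighting, used when no data-driven scheme is available.
        weights = [1 / len(values)] * len(values)
    if len(weights) != len(values):
        raise ValueError("need exactly one weight per indicator")
    return sum(w * v for w, v in zip(weights, values))

# Illustrative values only, assumed to lie on a common 0-1 scale.
score = composite_indicator([0.2, 0.4, 0.6])
custom = composite_indicator([0.2, 0.4, 0.6], weights=[0.5, 0.25, 0.25])
```

With equal weights the composite is simply the arithmetic mean of the single indicators.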
|
contingency tables |
| |
A contingency table or two-way
frequency table is a tabular
representation of a bivariate data set
containing absolute or relative frequencies
for the categories of two qualitative
variables. The table shows the observed
frequencies for all combinations of
categories. |
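A two-way frequency table can be built by counting pairs of categories. The sketch below is illustrative: the function name `contingency_table` and the observations (sex and preferred party) are made up.

```python
from collections import Counter

def contingency_table(pairs):
    """Absolute frequencies for all combinations of two categorical variables."""
    pairs = list(pairs)
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that were never observed.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

# Hypothetical bivariate data set: (sex, preferred party).
observations = [
    ("female", "Party A"), ("female", "Party B"), ("male", "Party A"),
    ("male", "Party A"), ("female", "Party A"),
]
table = contingency_table(observations)
```

Dividing each cell by the number of observations would give the relative frequencies instead.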
|
data |
| |
observations belonging to a variable, i. e. to a
characteristic whose value may vary. |
|
demography |
| |
is the statistical study of populations.
It is a very general science that can be applied to any kind of
dynamic population, that is, one that changes over time or space. |
|
European Innovation
Scoreboard |
| |
The European Innovation Scoreboard is an example
of a composite indicator. It aims at measuring innovation in Europe by
using a linear combination of equally weighted single indicators. |
|
fertility rate |
| |
The mean number of
children that would be born alive to a woman during her lifetime if she
were to pass through her childbearing years conforming to the fertility
rates by age of a given year. This rate is therefore the completed
fertility of a hypothetical generation, computed by adding the fertility
rates by age for women in a given year (the number of women at each age
is assumed to be the same). The total fertility rate is also used to
indicate the replacement level fertility; in more highly developed
countries, a rate of 2.1 is considered to be replacement level. |
|
histogram |
| |
A histogram is a graphical tool that
provides a rough approximation of the frequency distribution of a data
set. The data belonging to a data set are grouped into intervals (bands,
bins), usually intervals of equal size. The histogram presents the
absolute or relative number of elements of the data set belonging to
the different intervals in the form of bars. |
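The grouping step behind a histogram can be sketched as follows. The function name `histogram_counts`, its parameters and the ages are hypothetical; the sketch assumes intervals of equal width that are closed on the left, with the upper edge of the last interval included.

```python
def histogram_counts(data, lower, width, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    upper = lower + bins * width
    counts = [0] * bins
    for x in data:
        if x == upper:
            index = bins - 1          # include the upper edge in the last bin
        else:
            index = int((x - lower) // width)
        if 0 <= index < bins:         # values outside the intervals are ignored
            counts[index] += 1
    return counts

ages = [3, 15, 22, 27, 34, 38, 41, 59]  # illustrative data only
absolute = histogram_counts(ages, lower=0, width=20, bins=3)
relative = [c / len(ages) for c in absolute]
```

The bar heights are then the absolute or the relative counts per interval.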
|
ISCED |
| |
ISCED
(International Standard
Classification of Education) denotes an
international classification scheme for
educational levels that has been developed
by UNESCO. |
|
labour
force |
| |
The labour force or "currently active
population" comprises all persons who fulfil
the requirements for inclusion among the
employed or the unemployed. |
|
life expectancy |
| |
Life Expectancy denotes the average lifespan
of an individual. |
|
life
expectancy at certain ages |
| |
The mean number of
years still to be lived by a person who has
reached a certain exact age, if subjected
throughout the rest of his or her life to
the current mortality conditions
(age-specific probabilities of dying). |
|
line graph |
| |
A line graph or time series graph
represents a widely used graphical tool
employed for visualizing data on
quantitative variables that are observed at
different points of time.
The development of a stock market index
during a trading day or the unemployment
rates for European countries for the last
decade can, for example, be displayed by
means of line graphs. |
|
life
expectancy at birth |
| |
Life
expectancy at birth denotes the average life span of a newborn baby. It depends
on life and health conditions. |
|
macro level |
| |
Macro level
refers to a large population, for example to
earnings of all employees of a country.
Contrary to this, the micro level
refers to small populations or even
individuals, for example to earnings of
employees of a company or of a single
employee. |
|
mean |
| |
The mean is a measure of centrality that, contrary
to the median, is sensitive
to outliers. For a data set of size n for a
quantitative variable, it is defined as the
sum of all elements of the data set, divided
by n. |
|
median |
| |
The median is
a measure of centrality. It separates the lowest 50 % of the values of
the data set from the largest 50 %. In order to calculate the median, the
elements of a given data set first need to be ranked in increasing
order. For an odd-sized set, the middle position of the ordered
set is uniquely defined and represents the median. In case of an even
number of entries, there are two elements holding the
middle position and the median is the arithmetic mean of
these two elements. |
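The procedure for odd-sized and even-sized data sets can be sketched as follows. The function name `median` is chosen for illustration; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(data):
    """Median of a non-empty data set of numbers."""
    values = sorted(data)          # rank the elements in increasing order
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return values[mid]         # odd size: unique middle element
    # Even size: arithmetic mean of the two middle elements.
    return (values[mid - 1] + values[mid]) / 2

odd_case = median([3, 1, 2])
even_case = median([4, 1, 3, 2])
```

Replacing the largest value by an extreme outlier leaves both results unchanged, which illustrates why the median is robust where the mean is not.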
|
millennium development goals |
| |
The Millennium
Development Goals denote an action plan of the United Nations that was
adopted in 2000 and re-confirmed in 2008. It includes 8 goals aiming at
considerably reducing poverty and improving health in developing countries by
2015. |
|
mode |
| |
The mode or
modal value of a data set denotes the value that occurs most frequently.
Such a most common value is not necessarily uniquely defined. |
|
mortality rate |
| |
The mortality rate denotes the number of
people who have died during a year as a
proportion of the total population. |
|
net
migration |
| |
Difference between immigration and
emigration or in-migration and out-migration for a given area and period
of time. |
|
net migration
rate |
| |
The net
migration rate is the net migration figure as a proportion of the total
population. The net migration figure or migration excess is obtained by
subtracting the number of those who have moved out of an area
(out-migration) from the number of those who have moved into that same
area (in-migration). |
|
old-age
dependency ratio |
| |
The old-age dependency ratio is the ratio of
the number of elderly persons at an age when they are generally
economically inactive to the number of persons of working age. |
|
operationalisation |
| |
denotes the process of searching for a measurable
variable that satisfactorily approximates a non-measurable latent
variable. |
|
pie chart |
| |
A pie chart uses a
circle (“pie”) for representing the total of
the absolute or relative frequencies that
have been observed for the categories of a
qualitative variable. The individual
frequencies are mirrored by sectors
(“slices”). The size of the angle of a
sector corresponds to the size of the
corresponding frequency. Employing a pie
chart without presenting the underlying data
implies losing the information related to
the absolute values. Another disadvantage of
pie charts is that slices of similar size
are often difficult to distinguish. |
|
population |
| |
The entire set of elements for which statistical
information is desired defines a population. Examples are the
group of tourists that visited Malta in 2009, the totality of babies
born in France in January 2010 or the set of beer bottles produced by a
German brewery on a specific day. Hence, it needs to be clearly defined
whether an element belongs to a population or not. |
|
population
change |
| |
is the difference
between the size of the population at the end and the beginning of a
period. It is equal to the algebraic sum of natural increase and net
migration (including corrections). There is negative change when both of
these components are negative or when one is negative and has a higher
absolute value than the other. |
|
population density |
| |
is the ratio between
(total) population and surface (land) area.
This ratio can be calculated for any
territorial unit for any point in time,
depending on the source of the population
data. In the domain DEMO-R in New Cronos the
population density is calculated using the
average (mid-year) population. |
|
Population Growth Rate (PGR) |
| |
is the fractional rate at which the number
of individuals in a population increases. Specifically, PGR ordinarily
refers to the change in population over a unit time period, often
expressed as a percentage of the number of individuals in the population
at the beginning of that period. |
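Expressed as a percentage of the initial population, the growth rate for one period can be sketched as follows. The function name `population_growth_rate` and the population figures are hypothetical.

```python
def population_growth_rate(start, end):
    """Change over one period as a percentage of the starting population."""
    if start <= 0:
        raise ValueError("starting population must be positive")
    return (end - start) / start * 100

# Illustrative figures only: 1,000,000 at the beginning of the period,
# 1,020,000 at the end.
rate = population_growth_rate(1_000_000, 1_020_000)
decline = population_growth_rate(1_000_000, 990_000)
```

A negative result indicates a shrinking population.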
|
population
pyramid |
| |
A population
pyramid, also called age pyramid, denotes a graphical instrument that
usually displays data grouped by year of birth or by age groups
comprising more than one year. The size of each age group is
represented by horizontal bars, showing the number of females on the
right and of males on the left. |
|
primary data collection |
| |
data are collected for the first time by a public or
non-public data producer, for example a Statistical Office or a
Marketing Research Institute. |
|
qualitative or non-numerical
variables |
| |
Qualitative or non-numerical variables represent
categories. If the categories represent only names or labels that
cannot be ordered, the data belonging to the variables are called
nominal data. For nominal data, only the frequencies for the categories can
be counted. If the categories can be ranked, the data belonging to the
variables are called ordinal data. Data on the religious
affiliation are nominal whereas data on the highest successfully
completed educational level of an individual are ordinal because the
levels can be ranked. |
|
quantile
|
| |
A p-quantile of a data set (0 < p < 1) is a value that
separates the lowest p ∙ 100 % of the values from
the largest (1-p) ∙ 100 % of the data set.
Special cases are the →
median (p = 0.5), the
→
lower quartile (p = 0.25), the →
upper quartile (p = 0.75) and the →
deciles (p = 0.1, p = 0.2, … , p = 0.9). The
deciles are often denoted by D1, D2, …, D9.
Obviously D5 coincides with the median. |
|
quantitative or numerical
variables |
| |
Quantitative or numerical
variables
represent amounts, contrary to →
qualitative or non-numerical
variables.
They can be either discrete (values represent isolated points) or
continuous (values represent an entire interval). The size of an
enterprise in terms of the number of employees represents a discrete
variable, whereas the height of an individual is continuous. |
|
quartile |
| |
A quartile of a data
set is a value that separates the lowest p ∙ 100 % of the values from the
largest (1-p) ∙ 100 % of the data set with p = 0.25 (lower quartile), p
= 0.5 (median) or p = 0.75 (upper quartile). Quartiles represent special
cases of
→ quantiles. |
|
random
sampling |
| |
denotes any method where the elements of a sample are
randomly selected. |
|
range |
| |
For a data set of size n for a quantitative variable, the
range is defined as the difference of the largest and the
smallest element of the data set. The range is a measure of spread
that, contrary to the standard deviation and the variance,
is unaffected by data changes as long as the extreme values of the data
set remain the same. A useful spread measure complementing the range is the
interquartile range IQR. The IQR is the difference of upper and
lower quartile. Hence, it represents the range of the “inner” half of
the data set ordered by increasing size. |
|
sample |
| |
Very often, in particular for large populations, data are
not collected for an entire population but only for a subset. Such a
subset represents a sample. The method of drawing a sample from
a population defines the sampling design. |
|
sampling bias |
| |
A sampling bias is any error connected with the
procedure of sampling. Examples are the problems of under- and
overcoverage or that of a response bias connected with non-random sampling. |
|
scatter plot |
| |
A scatter plot or scatter diagram
represents a graphical tool that is
employed for visualizing bivariate data
sets consisting of observations for
continuous quantitative variables. |
|
secondary data collection |
| |
using already existing data in order to analyse, to
provide interpretations or to use the data directly for evidence-based
decision making. Newspapers, for example, usually cannot afford to collect
statistical information on their own and rely on existing reliable data sources. |
|
simple
random sampling |
| |
A random sample is
called a simple random sample if each
subset of size n of the population used for
sampling has the same probability of being
chosen. If N denotes the size of the
underlying population, the property above
needs to be fulfilled for any integer n
below N. |
|
stacked bar
charts |
| |
Sometimes the bars of a bar
chart are
divided into at least two parts in order to
take more than one qualitative variable into
account. This leads to stacked bar charts.
An alternative for visualizing multivariate
data on qualitative variables is the
→
clustered bar chart, obtained by
juxtaposing the partial bars for categories
belonging to different variables. |
|
standard
deviation |
| |
The standard deviation results by taking the
square root of the variance. It is a linear measure of spread
that, contrary to the range, is affected by changes of any data value,
even when the extreme values of the data set remain the same. |
|
statistical
indicators |
| |
Statistical Indicators represent
statistical information related to one or more variables for specified
time points or regions. They are widely used in numerous fields of
societal life, for example for describing socio-economic developments or for policy monitoring. |
|
statistical
inference |
| |
drawing
conclusions about a population from a sample drawn from it.
In most cases, one uses random samples.
These are drawn by applying a non-systematic
selection procedure. |
|
strata |
| |
Strata
(singular: stratum) are relatively
homogeneous subgroups of a population.
Stratified sampling is a method of
random sampling
that applies the sampling procedure to the
subgroups. |
|
stratified sampling |
| |
The approach of dividing a population into subgroups
(so-called strata) and drawing random samples from each stratum
is called stratified sampling. |
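Drawing a random sample from each stratum can be sketched as follows. The function name `stratified_sample`, the `fraction` and `seed` parameters and the regional data are hypothetical, and proportional allocation (the same sampling fraction in every stratum) is an assumption, not something the definition prescribes.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """strata maps a stratum name to the list of its elements."""
    rng = random.Random(seed)
    sample = {}
    for name in sorted(strata):
        members = strata[name]
        # Assumption: proportional allocation, at least one element per stratum.
        k = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population split into two strata by region.
population = {
    "north": [f"n{i}" for i in range(50)],
    "south": [f"s{i}" for i in range(30)],
}
drawn = stratified_sample(population, fraction=0.1, seed=7)
```

In contrast to cluster sampling, every stratum contributes elements, but only a random part of each.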
|
total
dependency ratio |
| |
is the ratio of the youth and elderly
population to the working age population. Youth is defined as people
aged 0-14 years. Elderly is defined as people aged 65 years and
older. Working age is defined as people aged 15-64 years. |
|
variable |
| |
A variable is a
characteristic with non-constant values. The
values can be either amounts (quantitative
or numerical variables) or categories
(qualitative or non-numerical variables). |
|
variance |
| |
The variance is a quadratic measure of
variability. For a data set of size n for a quantitative variable, it is
calculated by dividing the sum of the squared
deviations of the data from the mean by the number n of elements of
the data set. One often uses a slightly modified formula for the
variance that results by dividing the sum of the squared
deviations by n – 1 instead of n. (The corrected version is
advantageous when being applied for estimating purposes.) |
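Both versions of the variance can be sketched in one function. The function name `variance`, the flag `corrected` and the data are chosen for illustration; Python's standard library provides `statistics.pvariance` (division by n) and `statistics.variance` (division by n – 1).

```python
import math

def variance(data, corrected=False):
    """Sum of squared deviations from the mean, divided by n or by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if corrected else n)

values = [2, 4, 4, 4, 5, 5, 7, 9]         # illustrative data only
var_n = variance(values)                   # division by n
var_n_minus_1 = variance(values, corrected=True)
std_dev = math.sqrt(var_n)                 # standard deviation
```

For this data set the mean is 5 and the sum of squared deviations is 32, so the two versions differ only in the divisor (8 versus 7).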
|
young
dependency ratio |
| |
The young dependency
ratio includes only under 15s, and the elderly dependency ratio focuses
on those over 64. For example, if in a population of 1,000 there are 250
people under the age of 15 and 500 people between the ages of 15 and 64, the
young dependency ratio is 50% (250/500). |
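The worked example can be extended to all three dependency ratios. In this sketch the function name `dependency_ratios` is hypothetical; the elderly count of 250 is not stated in the example but follows from the total of 1,000.

```python
def dependency_ratios(young, working_age, elderly):
    """Young, old-age and total dependency ratios relative to working age."""
    if working_age <= 0:
        raise ValueError("working-age population must be positive")
    return {
        "young": young / working_age,
        "old_age": elderly / working_age,
        "total": (young + elderly) / working_age,
    }

# Population of 1,000: 250 under 15, 500 aged 15-64,
# hence 1,000 - 250 - 500 = 250 aged 65 and older.
ratios = dependency_ratios(young=250, working_age=500, elderly=250)
```

Here the young and old-age ratios are both 0.5 (50%), and the total dependency ratio is 1.0 (100%).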
| |
| |
|