
Glossary  

 ageing society
  A society is called an "ageing society" when more than seven percent of its population is over 65 years old (United Nations).
 bar chart
  A bar chart represents the absolute or relative frequencies for the categories of a qualitative variable by means of bars. The lengths of the bars are proportional to the values they represent. Sometimes the bars are divided into two or more parts in order to take more than one qualitative variable into account. This leads to stacked bar charts. Clustered bar charts are an alternative for visualizing multivariate data on qualitative variables. They result from juxtaposing the bars for categories belonging to different variables. A stacked or a clustered bar chart can, for example, be used to show voters' preferences for different political parties by sex. The stacked bar chart divides the bar for each party into two parts, whereas the clustered bar chart places the two components side by side.
 birth rate
  The birth rate is the number of live births during a certain year as a proportion of the total population.
 boxplot
  A boxplot is a graphical tool that summarizes the information contained in a data set for a continuous quantitative variable by displaying five characteristics of the set. The five characteristics are the extreme values (minimum and maximum), the lower and upper quartiles x0.25 and x0.75, and the median x0.5. The graph comprises a box that is connected with the extreme values by means of lines (so-called whiskers). The length of the box is defined by the difference x0.75 – x0.25 between the upper quartile and the lower quartile of the data set. The median x0.5 is marked inside the box. The total length of a boxplot (box and whiskers) represents the range of the data set.
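  The five characteristics can be computed as in the following minimal Python sketch. The data are made up, and the quartile convention (the standard library's "inclusive" method) is an assumption, since the glossary does not fix one.

```python
import statistics

# Hypothetical data set, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Quartiles depend on the interpolation convention; "inclusive" is one choice.
q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_numbers = {
    "minimum": min(data),
    "lower quartile": q1,
    "median": med,
    "upper quartile": q3,
    "maximum": max(data),
}
box_length = q3 - q1                   # upper quartile minus lower quartile
total_length = max(data) - min(data)   # range of the data set
print(five_numbers, box_length, total_length)
```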
 census of population
  A census is the procedure of systematically collecting statistical information on all people of a country. Because of the large amount of personnel and financial resources needed to carry out a census, National Statistical Offices conduct one only rarely. The United Nations recommends repeating a census every 10 years.
 clustered bar charts
  Sometimes the bars of a bar chart are divided into two or more parts in order to take more than one qualitative variable into account. Juxtaposing the partial bars for categories belonging to different variables leads to clustered bar charts. Stacked bar charts are an alternative for visualizing multivariate data on qualitative variables. They result from laying one partial bar on top of the other.
 cluster sampling
  uses a “natural” partition of a population into subgroups as a starting point. The natural subgroups are called clusters. In step 1, a random sample of clusters is drawn. Step 2 comprises the collection of data for all elements of the selected clusters.
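  The two steps might be sketched as follows in Python. The clusters, names and seed are hypothetical and serve only as an illustration.

```python
import random

# Hypothetical population, partitioned into "natural" clusters (e.g. school classes).
clusters = {
    "class_a": ["Ann", "Ben", "Cat"],
    "class_b": ["Dan", "Eve"],
    "class_c": ["Fay", "Gus", "Hal", "Ivy"],
    "class_d": ["Jon", "Kim"],
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Step 1: draw a random sample of clusters.
chosen = rng.sample(sorted(clusters), k=2)

# Step 2: collect data for all elements of the selected clusters.
sample = [person for name in chosen for person in clusters[name]]
print(chosen, sample)
```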
 composite indicators
  Composite indicators represent a linear combination of various single indicators. The main problem in using these “synthetic indicators” is finding non-subjective weights for the single indicators. Often, equal weights are employed because no weighting scheme derived from data is available.
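  A minimal sketch of an equally weighted composite indicator in Python. The indicator names and values are invented, and the values are assumed to be already normalised to a common scale.

```python
# Hypothetical single indicators, already normalised to a common 0-1 scale.
indicators = {"education": 0.5, "research": 0.25, "patents": 0.75}

# Equal weights, as often used when no data-driven weighting scheme exists.
weights = {name: 1 / len(indicators) for name in indicators}

# The composite indicator is the weighted sum (a linear combination).
composite = sum(weights[name] * value for name, value in indicators.items())
print(round(composite, 3))
```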
 contingency tables
  A contingency table or two-way frequency table is a tabular representation of a bivariate data set containing absolute or relative frequencies for the categories of two qualitative variables. The table shows the observed frequencies for all combinations of categories.
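  As an illustration, a small contingency table of absolute frequencies can be built in Python as sketched below. The observations (sex and preferred party) are made up.

```python
from collections import Counter

# Hypothetical bivariate observations: (sex, preferred party).
observations = [
    ("female", "A"), ("male", "B"), ("female", "B"),
    ("male", "A"), ("female", "A"), ("male", "B"),
]

# Absolute frequencies for all observed combinations of categories.
counts = Counter(observations)

rows = sorted({sex for sex, _ in observations})
cols = sorted({party for _, party in observations})

# Print the table: one row per sex, one column per party.
print("sex".ljust(8) + "".join(c.rjust(4) for c in cols))
for r in rows:
    print(r.ljust(8) + "".join(str(counts[(r, c)]).rjust(4) for c in cols))
```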
 data
  observations belonging to a variable, i. e. to a characteristic whose value may vary.
 demography
  is the statistical study of all populations. It can be a very general science that can be applied to any kind of dynamic population, that is, one that changes over time or space.
 European Innovation Scoreboard
  The European Innovation Scoreboard is an example of a composite indicator. It aims at measuring innovation in Europe by using a linear combination of equally weighted single indicators.
 fertility rate
  The mean number of children that would be born alive to a woman during her lifetime if she were to pass through her childbearing years conforming to the fertility rates by age of a given year. This rate is therefore the completed fertility of a hypothetical generation, computed by adding the fertility rates by age for women in a given year (the number of women at each age is assumed to be the same). The total fertility rate is also used to indicate the replacement level fertility; in more highly developed countries, a rate of 2.1 is considered to be replacement level.
 histogram
  A histogram is a graphical tool that provides a rough approximation of the frequency distribution of a data set. The elements of the data set are grouped into intervals (bands, bins), usually of equal size. The histogram presents the absolute or relative number of elements falling into each interval in the form of bars.
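  The grouping into intervals of equal size can be sketched in Python as follows. The data, the bounds and the number of bins are assumptions chosen for illustration.

```python
# Hypothetical data and bin layout, for illustration only.
data = [1.2, 2.7, 3.1, 3.8, 4.4, 5.0, 6.3, 7.9, 8.5, 9.9]
lower, upper, n_bins = 0.0, 10.0, 5
width = (upper - lower) / n_bins  # intervals of equal size

# Absolute frequency per interval [lower + i*width, lower + (i+1)*width).
counts = [0] * n_bins
for x in data:
    i = min(int((x - lower) / width), n_bins - 1)  # a value equal to upper goes in the last bin
    counts[i] += 1

# Text rendering: one bar per interval.
for i, c in enumerate(counts):
    left = lower + i * width
    print(f"[{left:4.1f}, {left + width:4.1f}) {'#' * c}")
```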
 ISCED
  ISCED (International Standard Classification of Education) denotes an international classification scheme for educational levels that has been developed by UNESCO.
 labour force
  The labour force or "currently active population" comprises all persons who fulfil the requirements for inclusion among the employed or the unemployed.
 life expectancy
  Life Expectancy denotes the average lifespan of an individual.
 life expectancy at certain ages
  The mean number of years still to be lived by a person who has reached a certain exact age, if subjected throughout the rest of his or her life to the current mortality conditions (age-specific probabilities of dying).
 line graph
  A line graph or time series graph represents a widely used graphical tool employed for visualizing data on quantitative variables that are observed at different points of time.
The development of a stock market index during a trading day or the unemployment rates of European countries over the last decade can, for example, be displayed by means of line graphs.
 life expectancy at birth
  Life expectancy at birth denotes the average life span of a newborn baby. It depends on living and health conditions.
 macro level
  Macro level refers to a large population, for example to earnings of all employees of a country. Contrary to this, the micro level refers to small populations or even individuals, for example to earnings of employees of a company or of a single employee.
 mean
  The mean is a measure of centrality that, contrary to the median, is sensitive to outliers. For a data set of size n for a quantitative variable, it is defined as the sum of all elements of the data set, divided by n.
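  A short Python sketch with made-up data shows the definition and the effect of a single outlier on the mean, compared with the median.

```python
import statistics

# Hypothetical data set; the second one replaces the largest value by an outlier.
data = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]

def mean(values):
    """Sum of all elements divided by the number of elements n."""
    return sum(values) / len(values)

print(mean(data), statistics.median(data))                  # 3.0 3
print(mean(with_outlier), statistics.median(with_outlier))  # 22.0 3
```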
 median
  The median is a measure of centrality. It separates the lowest 50 % of the values of a data set from the highest 50 %. In order to calculate the median, the elements of a given data set first need to be ranked in increasing order. For an odd-sized set, the middle position of the ordered set is uniquely defined and represents the median. In the case of an even number of entries, there are two elements holding the middle position, and the median is the arithmetic mean of these two elements.
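  The calculation rule for odd-sized and even-sized data sets can be written as a small Python function. This is an illustrative sketch with hypothetical inputs.

```python
def median(values):
    """Median via ranking: middle element, or mean of the two middle elements."""
    ordered = sorted(values)          # rank by increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                    # odd size: unique middle position
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even size

print(median([7, 1, 5]))      # 5
print(median([7, 1, 5, 3]))   # 4.0
```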
 millennium development goals
  The Millennium Development Goals denote an action plan of the United Nations that was adopted in 2000 and re-confirmed in 2008. It includes 8 goals aiming at considerably reducing poverty and improving health in developing countries by 2015.
 mode
  The mode or modal value of a data set denotes the value that occurs most frequently. Such a most common value is not necessarily uniquely defined.
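  That the most common value need not be unique can be seen in the following Python sketch, which uses the standard library function statistics.multimode on made-up data.

```python
import statistics

# Hypothetical data set in which two values tie for most frequent.
data = [1, 1, 2, 2, 3]

modes = statistics.multimode(data)  # all most common values, in first-seen order
print(modes)  # [1, 2]
```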
 mortality rate
  The mortality rate denotes the number of people who have died during a year as a proportion of the total population.
 net migration
  Difference between immigration and emigration or in-migration and out-migration for a given area and period of time.
 net migration rate
  The net migration rate is the net migration figure as a proportion of the total population. The net migration figure or migration excess results from deducting the number of those who have moved out of an area (out-migration) from the number of those who have moved into that same area (in-migration).
 old-age dependency ratio
  The old-age dependency ratio is the number of elderly persons, at an age when they are generally economically inactive, divided by the number of persons of working age.
 operationalisation
  denotes the process of searching for a measurable variable that satisfactorily approximates a non-measurable latent variable.
 pie chart
  A pie chart uses a circle (“pie”) for representing the total of the absolute or relative frequencies that have been observed for the categories of a qualitative variable. The individual frequencies are mirrored by sectors (“slices”). The size of the angle of a sector corresponds to the size of the corresponding frequency. Employing a pie chart without presenting the underlying data implies losing the information related to the absolute values. Another disadvantage of pie charts is that slices of similar size are often difficult to distinguish.
 population
  The entire set of elements for which statistical information is desired defines a population. Examples are the group of tourists that visited Malta in 2009, the totality of babies born in France in January 2010 or the set of beer bottles produced by a German brewery on a specific day. Hence, it needs to be clearly defined whether an element belongs to a population or not.
 population change
  is the difference between the size of the population at the end and the beginning of a period. It is equal to the algebraic sum of natural increase and net migration (including corrections). There is negative change when both of these components are negative or when one is negative and has a higher absolute value than the other.
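  A small worked example in Python, using invented figures. Natural increase is taken here as births minus deaths, which is an assumption not spelled out in this glossary, and corrections are ignored.

```python
# Hypothetical figures for one area and one year.
population_start = 400_000
births, deaths = 4_000, 3_500
immigration, emigration = 2_000, 2_800

natural_increase = births - deaths        # 500 (assumed definition: births minus deaths)
net_migration = immigration - emigration  # -800

# Population change is the algebraic sum of both components.
population_change = natural_increase + net_migration  # -300
population_end = population_start + population_change

print(population_change, population_end)  # -300 399700
```

  Here the change is negative because the negative component (net migration) has a higher absolute value than the positive one.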
 population density
  is the ratio between (total) population and surface (land) area. This ratio can be calculated for any territorial unit for any point in time, depending on the source of the population data. In the domain DEMO-R in New Cronos the population density is calculated using the average (mid-year) population.
 Population Growth Rate (PGR)
  is the fractional rate at which the number of individuals in a population increases. Specifically, PGR ordinarily refers to the change in population over a unit time period, often expressed as a percentage of the number of individuals in the population at the beginning of that period.
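  Expressed as a percentage of the population at the beginning of the period, the rate might be computed as in this Python sketch. The function name and the figures are hypothetical.

```python
def population_growth_rate(start, end):
    """Change over the period as a percentage of the starting population."""
    if start <= 0:
        raise ValueError("starting population must be positive")
    return (end - start) * 100 / start

# Hypothetical figures: 400,000 at the beginning, 404,000 at the end of the period.
print(population_growth_rate(400_000, 404_000))  # 1.0
```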
 population pyramid
  A population pyramid, also called age pyramid, denotes a graphical instrument that usually displays data grouped by year of birth or by age groups comprising more than one year. The size of each age group is represented by horizontal bars, showing the number of females on the right and of males on the left.
 primary data collection
  data are collected for the first time by a public or non-public data producer, for example a Statistical Office or a Marketing Research Institute.
 qualitative or non-numerical variables
  Qualitative or non-numerical variables represent categories. If the categories represent only names or labels that cannot be ordered, the data belonging to the variables are called nominal data. For nominal data, only the frequencies for the categories can be counted. If the categories can be ranked, the data belonging to the variables are called ordinal data.
Data on the religious affiliation are nominal whereas data on the highest successfully completed educational level of an individual are ordinal because the levels can be ranked.
 quantile
  A quantile of a data set is a value that separates the lowest p ∙ 100 % values and the largest (1-p) ∙ 100 % of the data set. Special cases are the   median (p = 0.5), the lower quartile (p = 0.25), the   upper quartile (p = 0.75) and the   deciles (p = 0.1, p = 0.2, … , p = 0.9). The deciles are often denoted by D1, D2, …, D9.  Obviously D5 coincides with the median.
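  The deciles of a small made-up data set can be obtained in Python as sketched below. The interpolation rule is the standard library default, which is one convention among several and an assumption here.

```python
import statistics

# Hypothetical data set, for illustration only.
data = list(range(1, 12))  # 1, 2, ..., 11

# Nine cut points D1..D9, computed with the library's default interpolation rule.
deciles = statistics.quantiles(data, n=10)

print(deciles[4], statistics.median(data))  # D5 and the median
```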
 quantitative or numerical variables
  Quantitative or numerical variables represent amounts, contrary to   qualitative or non-numerical variables.  They can be either discrete (values represent isolated points) or continuous (values represent an entire interval). The size of an enterprise in terms of the number of employees represents a discrete variable, whereas the height of an individual is continuous.
 quartile
  A quartile of a data set is a value that separates the lowest p ∙ 100 % values and the largest (1-p) ∙ 100 % of the data set with p = 0.25 (lower quartile), p = 0.5 (median) or p = 0.75 (upper quartile). Quartiles represent special cases of quantiles.
 random sampling
  denotes any method where the elements of a sample are randomly selected.
 range
  For a data set of size n for a quantitative variable, the range is defined as the difference of the largest and the smallest element of the data set.  The range is a measure of spread that, contrary to the standard deviation and the variance, is unaffected by data changes as long as the extreme values of the data set remain the same. A useful spread measure complementing the range is the interquartile range IQR.  The IQR is the difference of upper and lower quartile. Hence, it represents the range of the “inner” half of the data set ordered by increasing size.
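  The following Python sketch compares two made-up data sets with identical extreme values. Their ranges coincide, while the standard deviation and the IQR differ. The quartile convention (the library's "inclusive" method) is an assumption.

```python
import statistics

def value_range(values):
    """Difference between the largest and the smallest element."""
    return max(values) - min(values)

def iqr(values):
    """Upper quartile minus lower quartile (quartile convention is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Two hypothetical data sets with identical extreme values.
a = [1, 4, 5, 6, 9]
b = [1, 2, 5, 8, 9]

print(value_range(a), value_range(b))               # 8 8
print(statistics.pstdev(a) < statistics.pstdev(b))  # True
print(iqr(a), iqr(b))                               # 2.0 6.0
```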
 sample
  Very often, in particular for large populations, data are not collected for an entire population but only for a subset. Such a subset represents a sample.  The method of drawing a sample from a population defines the sampling design.
 sampling bias
  A sampling bias is any error connected with the procedure of sampling. Examples are the problems of under- and over-coverage or that of a response bias connected with non-random sampling.
 scatter plot
  A scatter plot or scatter diagram represents a graphical tool that is employed for visualizing  bivariate data sets consisting of observations for continuous quantitative variables.
 secondary data collection
  using already existing data in order to analyse them, to provide interpretations or to use the data directly for evidence-based decision making. Newspapers, for example, usually cannot afford to gather statistical information on their own and rely on existing reliable data sources.
 simple random sampling
  A random sample is called a simple random sample if each subset of size n of the population used for sampling has the same probability of being chosen. If N denotes the size of the underlying population, this property needs to be fulfilled for any whole number n below N.
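  In Python, a simple random sample without replacement can be drawn with random.sample, as in this sketch. The population, the sample size and the seed are hypothetical.

```python
import random

# Hypothetical population of N = 10 labelled elements.
population = list(range(1, 11))
n = 3

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# random.sample draws without replacement; every subset of size n
# is equally likely to be selected.
sample = rng.sample(population, n)
print(sample)
```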
 stacked bar charts
  Sometimes the bars of a bar chart are divided into two or more parts in order to take more than one qualitative variable into account. This leads to stacked bar charts. Clustered bar charts are an alternative for visualizing multivariate data on qualitative variables. They result from juxtaposing the partial bars for categories belonging to different variables.
 standard deviation
  The standard deviation is the square root of the variance. It is a linear measure of spread that, contrary to the range, is affected by changes of any data values, even if the extreme values of the data set remain the same.
 statistical indicators
  Statistical Indicators represent statistical information related to one or more variables for specified time points or regions. They are widely used in numerous fields of societal life, for example for describing socio-economic developments or for policy monitoring.
 statistical inference
  drawing conclusions about a population from a sample drawn from it. In most cases, one uses random samples. These are drawn by applying a non-systematic selection procedure.
 strata
  Strata  (singular: stratum) are relatively homogenous subgroups of a population. Stratified sampling is a method of random sampling that applies the sampling procedure to the subgroups.
 stratified sampling
  The approach of dividing a population into subgroups (so-called strata) and drawing a random sample from each stratum is called stratified sampling.
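  A minimal Python sketch of this approach. The strata, the common sampling fraction of 10 % and the seed are assumptions made for illustration.

```python
import random

# Hypothetical population divided into strata (here: by age group).
strata = {
    "0-14": list(range(0, 20)),
    "15-64": list(range(20, 80)),
    "65+": list(range(80, 100)),
}

rng = random.Random(1)  # fixed seed so the sketch is reproducible
fraction = 0.1          # assumed sampling fraction, the same in every stratum

# Draw a separate simple random sample from each stratum.
sample = {
    name: rng.sample(members, max(1, round(fraction * len(members))))
    for name, members in strata.items()
}
print({name: len(drawn) for name, drawn in sample.items()})
```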
 total dependency ratio
  is the ratio of the youth and elderly population to the working age population. Youth is defined as people aged 0-14 years. Elderly is defined as people aged 65 years and older. Working age is defined as people aged 15-64 years.
 variable
  A variable is a characteristic with non-constant values. The values can be either amounts (quantitative or numerical variables) or categories (qualitative or non-numerical variables).
 variance
  The variance is a quadratic measure of variability. For a data set of size n for a quantitative variable, it is calculated by dividing the sum of the squared deviations of the data from the mean by the size n of the data set. One often uses a slightly modified formula for the variance that results from dividing the sum of the squared deviations by n – 1 instead of n. (The corrected version is advantageous when applied for estimation purposes.)
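  Both versions of the formula can be sketched in Python as follows. The function name, the corrected flag and the data are hypothetical.

```python
import math

def variance(values, corrected=False):
    """Sum of squared deviations from the mean, divided by n (or n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if corrected else n)

# Hypothetical data set, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data))                  # 4.0
print(math.sqrt(variance(data)))       # 2.0 (standard deviation)
print(variance(data, corrected=True))  # divisor n - 1
```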
 young dependency ratio
  The young dependency ratio relates only those under 15 to the working-age population, whereas the elderly dependency ratio focuses on those over 64. For example, if in a population of 1,000 there are 250 people under the age of 15 and 500 people between the ages of 15 and 64, the youth dependency ratio is 50% (250/500).
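  The example figures can be carried through all three dependency ratios in a short Python sketch. The number of elderly persons is derived here as the remainder of the population, which is an inference from the example rather than a stated figure.

```python
# Figures from the example above: population of 1,000 with 250 people under 15
# and 500 aged 15-64; the remaining 250 are therefore aged 65 and over.
total = 1_000
young = 250
working_age = 500
elderly = total - young - working_age  # 250

young_ratio = young / working_age * 100              # 50.0
old_age_ratio = elderly / working_age * 100          # 50.0
total_ratio = (young + elderly) / working_age * 100  # 100.0

print(young_ratio, old_age_ratio, total_ratio)
```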
 
   


© National Statistics Office, Malta / Statistics Finland