ORGANISATION OF DATA


Organisation of data refers to the arrangement of figures in such a form that comparison of the mass of similar data may be facilitated and further analysis may be possible.

Classification
Classification is the process of arranging things in groups or classes according to their resemblances and affinities and gives expression to the unity of attributes that may exist amongst a diversity of individuals.

Objectives of Classification

  • Simplification and Briefness
  • Utility
  • Distinctiveness
  • Comparability
  • Scientific arrangement
  • Attractive and effective

Characteristics of a Good Classification

  • Comprehensiveness
  • Clarity
  • Homogeneity
  • Suitability
  • Stability
  • Elasticity

Basis of Classification

  • Geographical Classification - This classification of data is based on the geographical or locational differences of the data.
  • Chronological Classification - When data are classified on the basis of time, it is known as chronological classification.
  • Qualitative Classification - This classification is according to qualities or attributes of the data.
    This classification may be of two types:

    • Simple classification
    • Manifold classification
  • Quantitative or Numerical Classification - Data are classified into classes or groups on the basis of their numerical values. Quantitative classification is also called classification by variables.
  • Concept of Variable - A characteristic or phenomenon which is capable of being measured and changes its value over time is called a variable.
    A variable may be either discrete or continuous:

    • Discrete Variable - These are variables that increase in jumps or in complete numbers.
    • Continuous Variable - Variables that assume a range of values, or increase not in jumps but continuously or in fractions, are called continuous variables.
  • Raw Data - A mass of data in its crude form is called raw data.

Types of Statistical Series

Statistical series are of two types:

  • Individual Series - These are series in which the items are listed singly. These series may be presented in two ways:
    • According to serial numbers
    • Ascending or descending order of data
  • Frequency Series - Frequency series may be of two types:
    • Discrete Series or Frequency Array - It is a series in which data are presented in such a way that the exact measurements of items are clearly shown. In this series there are no class intervals, and each particular item is shown with its frequency.
    • Frequency Distribution - It is a series in which items cannot be exactly measured. The items assume a range of values and are placed within limits called class intervals.

Frequency distribution is also known as continuous series or series with class-intervals, or series of grouped data.

Tally Bar – This method of marking and counting is known as the four and cross method. For example, the marks of 30 students are:

30, 25, 15, 30, 25, 35, 35, 10, 20, 10, 45, 20, 10, 40, 20, 10, 30, 25, 20, 15, 30, 20, 15, 15, 25, 20, 15, 35, 20, 25

From these data we find:

Highest = 45

Lowest = 10

Range = 45 - 10 = 35
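
The tally count for these marks can be checked with a short script. This is only a sketch in Python (the notes themselves do not use any programming language), and the names `marks` and `frequency` are chosen here for illustration.

```python
from collections import Counter

# Marks of the 30 students listed above
marks = [30, 25, 15, 30, 25, 35, 35, 10, 20, 10, 45, 20, 10, 40, 20,
         10, 30, 25, 20, 15, 30, 20, 15, 15, 25, 20, 15, 35, 20, 25]

# Counter plays the role of the tally bars: one count per distinct mark
frequency = Counter(marks)

highest = max(marks)
lowest = min(marks)
data_range = highest - lowest

print("Highest =", highest, "Lowest =", lowest, "Range =", data_range)
print("Marks  Tally        Frequency")
for mark in sorted(frequency):
    count = frequency[mark]
    # Four and cross method: strokes are grouped in fives.
    # On paper the fifth stroke crosses the first four; here each
    # complete group is simply printed as five strokes.
    groups, rest = divmod(count, 5)
    tally = " ".join(["|||||"] * groups + (["|" * rest] if rest else []))
    print(f"{mark:<6} {tally:<12} {count}")
```

The frequencies printed in the last column should add up to 30, the number of students.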




Types of Frequency Distribution

  • Exclusive Series - It is a series in which every class-interval excludes items corresponding to its upper limit.
  • Inclusive Series - An inclusive series is a series in which every class-interval includes all items up to and including its upper limit.
  • Open End Series - An open end series is a series in which the lower limit of the first class-interval and the upper limit of the last class-interval are missing, for example "below 5" and "20 and above".
  • Cumulative Frequency Series - It is a series in which the frequencies are continuously added corresponding to each class-interval in the series.
    There are two ways of converting a series into a cumulative frequency series:

    • Cumulative frequencies may be expressed on the basis of the upper class limits of the class-intervals ("less than" type).
    • Cumulative frequencies may be expressed on the basis of the lower class limits of the class-intervals ("more than" type).
  • Mid Values Frequency Series - Mid value frequency series are those series in which we have only mid values of the class intervals and the corresponding frequencies.
  • Univariate Distribution - The frequency distribution of a single variable is called a univariate distribution.
  • Bivariate Distribution - A bivariate distribution is the frequency distribution of two variables.
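
The difference between exclusive and inclusive class-intervals, and the idea of a mid value, can be illustrated with a small Python sketch. The helper names `exclusive_class`, `inclusive_class` and `mid_value`, the class width of 10 and the sample values are assumptions made for illustration only; they do not come from these notes.

```python
def exclusive_class(value, width=10):
    """Exclusive series: classes 0-10, 10-20, ...
    The upper limit is excluded, so a value equal to an upper
    limit is counted in the next class."""
    lower = (value // width) * width
    return (lower, lower + width)


def inclusive_class(value, width=10):
    """Inclusive series: classes 0-9, 10-19, ...
    Both limits belong to the class (whole-number values assumed)."""
    lower = (value // width) * width
    return (lower, lower + width - 1)


def mid_value(lower, upper):
    """Mid value of a class: average of its lower and upper limits."""
    return (lower + upper) / 2


for value in (9, 10, 19, 20):
    print(value,
          "exclusive:", exclusive_class(value),
          "inclusive:", inclusive_class(value))

print("Mid value of 10-20:", mid_value(10, 20))
```

In the exclusive series a value of 10 is placed in the class 10-20, not in 0-10, because 10 is the excluded upper limit of 0-10.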
Q1. What is the difference between univariate and bivariate frequency distributions?

Univariate Frequency Distribution

The word ‘Uni’ means one. A series of statistical data showing the frequency of only one variable is called a Univariate Frequency Distribution. In other words, the frequency distribution of a single variable is called a Univariate Frequency Distribution. For example: income of people, marks scored by students, etc.

Bivariate Frequency Distribution

The word ‘Bi’ means two. A series of statistical data showing the frequency of two variables simultaneously is called a Bivariate Frequency Distribution. In other words, the frequency distribution of two variables is called a Bivariate Frequency Distribution. For example: sales and advertisement expenditure, weight and height of individuals, etc.
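
To make the contrast concrete, the Python sketch below counts one variable at a time (univariate) and then two variables together (bivariate). The heights and weights are made-up illustrative figures, not data from these notes, and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical (height in cm, weight in kg) pairs for six individuals
observations = [(150, 50), (150, 55), (160, 55),
                (160, 55), (170, 60), (170, 65)]

# Univariate: frequency of a single variable (height only)
height_frequency = Counter(height for height, _ in observations)

# Bivariate: frequency of each (height, weight) combination
joint_frequency = Counter(observations)

print("Height only:", dict(height_frequency))
print("Height and weight together:", dict(joint_frequency))
```

The univariate count answers "how many individuals have this height", while the bivariate count answers "how many individuals have this height and this weight at the same time".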

Q2. Following are the figures of marks obtained by 40 students. You are required to arrange them in ascending and in descending order.

15 18 16 14 10 6 5 3 8 7
22 18 14 19 17 8 6 4 10 3
12 16 15 13 11 10 18 22 14 19
11 18 22 14 25 21 17 8 9 10

Ans. 
Marks in Ascending Order:
3, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9, 10, 10, 10, 10, 11, 11, 12, 13, 14, 14, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 18, 18, 19, 19, 21, 22, 22, 22, 25

Marks in Descending Order:
25, 22, 22, 22, 21, 19, 19, 18, 18, 18, 18, 17, 17, 16, 16, 15, 15, 14, 14, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 9, 8, 8, 8, 7, 6, 6, 5, 4, 3, 3
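
The same arrangement can be reproduced with a few lines of Python, shown here only as a way to check the answer. The variable names are illustrative.

```python
# Marks of the 40 students, row by row as given in the question
marks = [15, 18, 16, 14, 10, 6, 5, 3, 8, 7,
         22, 18, 14, 19, 17, 8, 6, 4, 10, 3,
         12, 16, 15, 13, 11, 10, 18, 22, 14, 19,
         11, 18, 22, 14, 25, 21, 17, 8, 9, 10]

ascending = sorted(marks)                 # smallest to largest
descending = sorted(marks, reverse=True)  # largest to smallest

print("Ascending: ", ", ".join(str(m) for m in ascending))
print("Descending:", ", ".join(str(m) for m in descending))
```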

Q3. Prepare a frequency table taking class intervals 20−24, 25−29, 30−34 and so on, from the following data:

21 20 55 39 48 46 36 54 42 30
29 42 32 40 34 31 35 37 52 44
39 45 37 33 51 53 52 46 43 47
41 26 52 48 25 34 37 33 36 27
54 36 41 33 23 39 28 44 45 38
Ans. 
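
One way to prepare the table is to tally each figure into its inclusive class-interval. The Python sketch below does this for the data above. The variable names are illustrative, and the last class (55-59) is simply the class into which the largest figure, 55, falls.

```python
from collections import Counter

data = [21, 20, 55, 39, 48, 46, 36, 54, 42, 30,
        29, 42, 32, 40, 34, 31, 35, 37, 52, 44,
        39, 45, 37, 33, 51, 53, 52, 46, 43, 47,
        41, 26, 52, 48, 25, 34, 37, 33, 36, 27,
        54, 36, 41, 33, 23, 39, 28, 44, 45, 38]

start, width = 20, 5  # first lower limit and class width from the question

# Index 0 stands for class 20-24, index 1 for 25-29, and so on
counts = Counter((value - start) // width for value in data)

table = []
for index in range(max(counts) + 1):
    lower = start + index * width
    upper = lower + width - 1  # inclusive upper limit
    table.append((f"{lower}-{upper}", counts[index]))

print("Class-Interval  Frequency")
for label, frequency in table:
    print(f"{label:<15} {frequency}")
print("Total".ljust(15), sum(counts.values()))
```

Running this gives frequencies of 3, 5, 8, 11, 8, 7, 7 and 1 for the classes 20-24 to 55-59, which add up to the 50 figures in the question.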

Convert the following 'more than' cumulative frequency distribution into a 'less than' cumulative frequency distribution.

Class-Interval (More than)    10    20    30    40    50    60    70    80
Frequency                    124   119   107    84    55    31    12     2
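
A 'less than' figure can be built up by first recovering the simple frequency of each class and then adding these frequencies from the lowest class upwards. A minimal Python sketch of this conversion follows. It assumes a class width of 10 throughout, so the last class is taken as 80-90; that upper limit is not stated in the question.

```python
limits = [10, 20, 30, 40, 50, 60, 70, 80]
more_than = [124, 119, 107, 84, 55, 31, 12, 2]

width = 10  # assumed class width; makes the last class 80-90

# Simple frequency of a class = its 'more than' figure minus the next one
# (the last class has nothing after it, so 0 is subtracted)
simple = [current - following
          for current, following in zip(more_than, more_than[1:] + [0])]

# 'Less than' cumulative frequencies at each upper class limit
less_than = []
running_total = 0
for lower, frequency in zip(limits, simple):
    running_total += frequency
    less_than.append((lower + width, running_total))

print("Less than  Cumulative Frequency")
for upper, cumulative in less_than:
    print(f"{upper:<10} {cumulative}")
```

Under this assumption the 'less than' figures are 5, 17, 40, 69, 93, 112, 122 and 124 for the limits 20 to 90.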




Exercises

1. Which of the following alternatives is true?

(i) The class midpoint is equal to:
(a) The average of the upper class limit and the lower class limit
(b) The product of upper class limit and the lower class limit
(c) The ratio of the upper class limit and the lower class limit
(d) None of the above
► (a) The average of the upper class limit and the lower class limit.

(ii) The frequency distribution of two variables is known as
(a) Univariate Distribution
(b) Bivariate Distribution
(c) Multivariate Distribution
(d) None of the above
► (b) Bivariate Distribution



(iii) Statistical calculations in classified data are based on
(a) the actual values of observations
(b) the upper class limits
(c) the lower class limits
(d) the class midpoints
► (d) the class midpoints

(iv) Under Exclusive method,
(a) the upper class limit of a class is excluded in the class interval
(b) the upper class limit of a class is included in the class interval
(c) the lower class limit of a class is excluded in the class interval
(d) the lower class limit of a class is included in the class interval
► (a) the upper class limit of a class is excluded in the class interval

(v) Range is the
(a) difference between the largest and the smallest observations
(b) difference between the smallest and the largest observations
(c) average of the largest and the smallest observations
(d) ratio of the largest to the smallest observation
► (a) difference between the largest and the smallest observations

2. Can there be any advantage in classifying things? Explain with an example from your daily life.

Answer

Yes, there are many advantages of classifying things. These are:
→ It saves time and energy by making it easy to locate specific data.
→ It facilitates analysis, tabulation and interpretation.
→ It makes data comparable.
→ It makes data easy to summarise.
For example: We keep a separate notebook for each subject.

3. What is a variable? Distinguish between a discrete and a continuous variable.
Ans.
A variable is a characteristic or phenomenon which is capable of being measured and changes its value over time.
Discrete Variable
A variable that increases in jumps or in complete numbers is called a discrete variable. When such values are arranged in a discrete series, each exactly measurable value of the variable is shown along with its corresponding frequency. For example: marks obtained by students.
Continuous Variable
A variable that can take any value within a reasonable limit is called a continuous variable.
These variables assume a range of values or increase in fractions and not in jumps.
For example: age, height, weight, etc.
Q4. Convert the following cumulative frequency distribution into a simple frequency distribution.

Marks      No. of Students
0-5        55
5-10       51
10-15      43
15-20      28
20-25      16
25-30      6
30-35      0
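
Since the figures fall as the marks rise, the Python sketch below treats them as 'more than' cumulative frequencies: 55 students scored 0 or more, 51 scored 5 or more, and so on. That reading is an assumption, as the question does not name the type. Under it, each class frequency is the difference between one cumulative figure and the next.

```python
classes = ["0-5", "5-10", "10-15", "15-20", "20-25", "25-30", "30-35"]
cumulative = [55, 51, 43, 28, 16, 6, 0]  # assumed 'more than' type

# Simple frequency = this cumulative figure minus the next one
# (0 is subtracted for the last class, which has no class after it)
simple = [current - following
          for current, following in zip(cumulative, cumulative[1:] + [0])]

print("Marks      No. of Students")
for label, frequency in zip(classes, simple):
    print(f"{label:<10} {frequency}")
print("Total".ljust(10), sum(simple))
```

With this reading the simple frequencies are 4, 8, 15, 12, 10, 6 and 0, which add up to the 55 students of the first cumulative figure.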

1. Assertion (A): Organisation of data refers to the process of arranging data into a meaningful order.
Reason (R): The purpose of organising data is to facilitate further analysis and comparison.
(a) Both Assertion (A) and Reason (R) are true and Reason (R) is the correct explanation of Assertion (A).
(b) Both Assertion (A) and Reason (R) are true but Reason (R) is not the correct explanation of Assertion (A). 
(c) Assertion (A) is true but Reason (R) is false. 
(d) Assertion (A) is false but Reason (R) is true. 
Answer: (a)
2. The difference between the upper limit and the lower limit of a class interval is known as:
a) Class width
b) Class mark
c) Frequency
d) Relative frequency
Ans (a) 

The graphical representation of a cumulative frequency distribution is known as:

a) Histogram    b) Frequency polygon

c) Ogive     d) Pie chart

Ans (c) 

2. Assertion (A): A Bivariate Frequency Distribution can be defined as the frequency distribution of two variables.

Reason (R): A bivariate frequency distribution is used to calculate correlation between the two variables.

(a) A and R are true and R explains A.

(b) A and R are true and R does not explain A.

(c) A is true but R is false.

(d) A is false but R is true.

Ans. (b)

3. Assertion (A): The raw data are summarised, and made comprehensible by classification.

Reason (R): The raw data consist of observations on variables.

(a) A and R are true and R explains A.

(b) A and R are true and R does not explain A.

(c) A is true but R is false.

(d) A is false but R is true.

Ans. (b)

Q.4. Statement I: Suppose X is a variable that takes values 1/8, 1/16, 1/32, 1/64, ...; it is a discrete variable.

Statement II: Once raw data are grouped into classes, individual observations are not used in further calculations such as standard deviation, etc. Instead, the Class Mark is used to represent the class.

(a) Both the statements are true.

(b) Both the statements are false.

(c) Statement I is true, Statement II is false.

(d) Statement II is true, Statement I is false.

Ans. (a)

Q.5. Statement I: Classification brings order to raw data.

Statement II: In a Frequency Distribution, further statistical calculations are based only on the class mark values, instead of the observations.

(a) Both the statements are true.

(b) Both the statements are false.

(c) Statement I is true, Statement II is false.

(d) Statement II is true, Statement I is false.

Ans. (a)

Q.6. Statement I: Either the upper class limit or the lower class limit is excluded in the Inclusive Method.

Statement II: Both the upper and the lower class limits are included in the Exclusive Method.

(a) Both the statements are true.

(b) Both the statements are false.

(c) Statement I is true, Statement II is false.

(d) Statement II is true, Statement I is false.

Ans. (b)

Q.7. Statement I: A Frequency Distribution shows how the different values of a variable are distributed in different classes along with their corresponding class frequencies.

Statement II: The classes should be formed in such a way that the class mark of each class comes as close as possible to a value around which the observations in a class tend to concentrate.

(a) Both the statements are true.

(b) Both the statements are false.

(c) Statement I is true, Statement II is false.

(d) Statement II is true, Statement I is false.

Ans. (a)

Q.8. Statement I: In a Chronological Classification, data are classified either in ascending or in descending order with reference to time, such as years.

Statement II: In spatial classification the data are classified with reference to geographical location such as countries, states, cities, districts, etc. 

(a) Both the statements are true.

(b) Both the statements are false.

(c) Statement I is true, Statement II is false.

(d) Statement II is true, Statement I is false.

Ans. (a)

