Skinner's Operant or Instrumental Conditioning Theory (S-R Theory with Reinforcement)

B.F. Skinner's Operant Conditioning, or Instrumental Conditioning, Theory of Learning stressed the importance of reinforcement in learning, rather than the mere formation of connections between stimuli and responses. The theory is called operant conditioning because it is based on certain operations or actions which the learner has to carry out. In Pavlov's classical conditioning, by contrast, the dog was harnessed on a table and remained passive; it performed no operations.

In classical conditioning the presence of a stimulus is essential to evoke a response. The subject (e.g., the child who expresses fear only when he hears a loud noise, or the dog that salivates while waiting for food to arrive) has no control over what happens and is made to behave in response to the stimulus situation. Skinner revolted against this view. He argued that in the practical situations of life we cannot wait for things to happen in the environment, and that a response need not always be evoked by some known stimulus or cause. In operant conditioning the subjects are active: they perform acts or 'carry out' operations. A dog, a child or any other individual does something, that is, behaves in some manner which operates on the environment, and the environment in turn responds to the activity.

Based on the findings of his experiments, Skinner concluded that "behaviour is shaped and maintained by its consequences." It is operated by the organism and maintained by its results. Such behaviour was named operant behaviour, and the process by which it is learned was named operant conditioning. More precisely, operant conditioning refers to a kind of learning process in which a response is made more probable or more frequent by reinforcement. It helps in the learning of operant behaviour, that is, behaviour that is not necessarily associated with a known stimulus.

The Experiments regarding Operant Conditioning

Skinner conducted a series of experiments with animals. For his experiments with rats, he constructed a soundproof box equipped with a lever (bar) and a food tray.

He put a hungry rat in the box. The box was arranged so that when the rat pressed the lever, the feeder mechanism was activated, a light or a special sound was produced, and a small pellet of food was released into the tray. All these events were connected to a recording system. The rat thus learned to press the bar more frequently because the food pellet reinforced the behaviour. Here, emitting the correct response is what matters most, and on this basis Skinner changed the traditional S-R formula into an R-S formula. He believed that there are responses without known stimuli, which he called "emitted" responses; responses to known stimuli are called "elicited" responses. In his view the response comes first and the stimulus follows. He recognized two kinds of reinforcers: positive and negative.
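The idea that a reinforced response becomes more frequent can be illustrated with a small simulation. The following is only a toy sketch, not a model taken from Skinner's work: the function name, the starting probability, the learning rate and the update rule are all assumptions chosen for illustration.

```python
import random


def simulate_lever_pressing(trials=200, p_press=0.05, learning_rate=0.1, seed=0):
    """Return the press probability after each trial of a toy Skinner-box run.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so that a run is repeatable
    history = []
    for _ in range(trials):
        # The rat emits (or does not emit) a lever press on this trial.
        pressed = rng.random() < p_press
        if pressed:
            # A food pellet follows the press, so the response is strengthened:
            # the probability moves a fraction of the way towards 1.
            p_press += learning_rate * (1.0 - p_press)
        history.append(p_press)
    return history


if __name__ == "__main__":
    run = simulate_lever_pressing()
    print(f"press probability after {len(run)} trials: {run[-1]:.2f}")
```

In this sketch the probability of pressing can only rise, and it rises only on trials where a press is actually followed by food. Setting learning_rate to 0 (no effect of reinforcement) leaves the probability unchanged throughout.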

Mechanisms of Operant Conditioning

The important thing in the mechanism of operant conditioning is the emitting of a desired response and its proper management through suitable reinforcement. Here the organism responds in a certain way so as to produce the reinforcing stimulus. The subsequent reinforcement gradually conditions the organism to emit the desired response and thus learn the desired act. The following are some of the mechanisms of operant conditioning:

(a) Shaping: It refers to the judicious use of selective reinforcement to bring about desirable changes in the behaviour of the organism, typically by rewarding successive steps towards the target behaviour. Suppose we want to toilet-train a child. Simply putting the child on the toilet does not succeed, because the child begins to cry as soon as he/she is placed on it. To shape the behaviour, the child is given a chocolate whenever he/she is placed on the toilet. It has been observed that successful elimination follows. Other techniques may be used in a similar way. Shaping can thus serve as a successful technique for training individuals to learn difficult and complex behaviour and for introducing desirable modifications in their behaviour.

(b) Chaining: It works like a chain reaction, in which one object sets off the next object in its proximity, which in turn sets off the next one in the chain, and so on. For example, seeing someone we know is an effective stimulus for starting a chain of responses. We greet him and he greets us in response. His response to our greeting acts not only as a reward for our greeting but also as a stimulus for a further response. In this way one response gives rise to another, and this sequence is called chaining.

(c) Generalisation: Both animals and human beings are capable of generalising experiences and knowledge acquired in one learning situation to another. Parents and teachers should therefore take care to reinforce children's behaviour only after the children demonstrate the ability to generalise correctly.

(d) Discrimination: The ability to discriminate is very important in behaviour formation. For example, in the Skinner box the animal learns to press the lever when the light is on and not to press it when the light is off. The light thus becomes a cue or signal for the operant behaviour, i.e., the lever-press response. Here the animal develops a discriminative operant, which is an operant response emitted in one set of circumstances but not in another. Similarly, in the learning process the teacher should provide feedback to children as and when they are able to discriminate good from bad, and should give proper feedback on the correctness of their responses.

(e) Reinforcement: The concept of reinforcement is central to operant conditioning theory and, in its positive form, amounts to the presentation of a reward. A reinforcer is a stimulus whose presentation (or, for a negative reinforcer, removal) increases the probability of a response. Skinner used reinforcement as a procedure for controlling behaviour, one that produces the stimulus-response connection, and recognized two kinds of reinforcers: positive and negative.

Positive Reinforcement

A positive reinforcer is any stimulus (such as food, money, water, social approval, praise, knowledge of results) the introduction or presentation of which increases the likelihood of a particular behaviour. In the educational context, praise, grades, medals, and other prizes awarded to students are examples of positive reinforcers.

Negative Reinforcement

A negative reinforcer is any unpleasant stimulus (such as loud noise, electric shock, social disapproval, condemnation) the removal or withdrawal of which increases the likelihood of a particular behaviour. In the educational context, a teacher's announcement that whoever does the drill work properly in class will be exempted from homework acts as a negative reinforcer.

The Schedules of Reinforcement

The term schedule of reinforcement refers to the particular pattern according to which reinforcers follow responses.

Some important schedules of reinforcement are the following:

1. Continuous reinforcement schedule: It is an arrangement in which reinforcement is provided after every correct response. In the teaching-learning process, rewarding a student for every correct answer with warm regards, praise or approval is an example of a continuous schedule of reinforcement.

2. Fixed interval reinforcement schedule: In the fixed or periodic interval schedule, reinforcement is presented after a prescribed interval of time, e.g., every 2 minutes or every 4 minutes. Here the emphasis is not on the number of correct responses; reinforcement is given only at the expiry of the fixed interval.

3. Fixed ratio reinforcement schedule: In this schedule reinforcement is given after a fixed number of responses, so it is the performance of the learner that matters. For example, a learner may be rewarded with a chocolate after answering a fixed number of questions, say 3. Similarly, a rat may get a pellet of food only after pressing the bar a fixed number of times, say 7.

4. Variable reinforcement schedule: When reinforcement is given at varying intervals of time or after a varying number of responses, it is called a variable reinforcement schedule. In this case, reinforcement is intermittent or irregular. The learner does not know when he/she is going to be rewarded and consequently remains motivated by the hope of reinforcement.
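The four schedules above can be compared side by side in a short sketch. The code below is only an illustration: the function names are hypothetical, the fixed interval version assumes the common laboratory convention that the first response made after the interval has elapsed is the one reinforced, and the variable schedule is modelled as a variable ratio in which each response is rewarded with probability 1/mean_ratio.

```python
import random


def continuous(n_responses):
    """Reinforce every response."""
    return [True] * n_responses


def fixed_ratio(n_responses, ratio):
    """Reinforce every `ratio`-th response (e.g. every 3rd answer)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` time units have passed.

    Assumption: the clock restarts from the moment a reinforcer is delivered.
    """
    rewarded = []
    next_available = interval
    for t in response_times:
        if t >= next_available:
            rewarded.append(True)
            next_available = t + interval
        else:
            rewarded.append(False)
    return rewarded


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Reinforce each response with probability 1/mean_ratio (unpredictable)."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    return [rng.random() < 1.0 / mean_ratio for _ in range(n_responses)]


if __name__ == "__main__":
    print("continuous     :", continuous(6))
    print("fixed ratio 3  :", fixed_ratio(6, 3))
    print("fixed interval :", fixed_interval([1, 2, 3, 4, 5, 6], 2))
    print("variable ratio :", variable_ratio(6, 3))
```

Each function returns one True/False value per response, so the patterns can be read directly: the fixed schedules produce a regular pattern the learner could anticipate, while the variable schedule does not.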

Implications of the Theory of Operant Conditioning

The following implications emerged from the theory of operant conditioning:

(i) The theory provides the basis for programmed instruction. This is a kind of learning experience in which a programme takes the place of a tutor and leads the student through a set of specified behaviours.

(ii) The theory has drawn attention to the inadequacy and unsuitability of the reinforcement procedures adopted in our educational system. The element of reinforcement can thus be strengthened in the teaching-learning process.

(iii) Generally, in our schools, the desirable behaviour of learners is not reinforced immediately, although immediate reinforcement would raise the probability of the same behaviour recurring in future. The theory suggests that delay destroys the effect of a reinforcing stimulus; hence reinforcement should be given at the appropriate time and at each step.

(iv) The principle of operant conditioning can be applied in behaviour modification and ultimately desired behaviour can be strengthened depending upon the manipulation of reward.

(v) Operant conditioning emphasizes the importance of schedules in the process of reinforcement of behaviour.

(vi) Operant conditioning suggested appropriate alternatives to punishment in the form of rewarding appropriate behaviour and ignoring inappropriate behaviour, for its gradual extinction.

(vii) Mechanised forms of learning, such as teaching machines and computer-assisted instruction, which have taken shape as alternatives to usual classroom instruction, have their roots in operant conditioning.

Utkarsh Education

March 30, 2023