If you are new to clicker training, you may be concerned that you will always have to click and treat your dog for every cued behaviour he performs. This is not the case.
Generally, when you are training a new behaviour, your dog will acquire it very quickly if you click and treat (C/T) for every correct response. This is called a continuous reinforcement schedule, or a fixed ratio reinforcement schedule of 1:1, meaning that you reinforce your dog for every response.
Once your dog knows the behaviour and it’s on cue, it’s time to introduce an intermittent reinforcement schedule. There are quite a few different schedules of reinforcement, but the most powerful schedule for strengthening a behaviour and preventing extinction is the variable ratio reinforcement schedule.
As an example, let’s suppose you’ve taught your dog to sit on cue, and you now want to put the behaviour on a variable ratio reinforcement schedule.
The ratio is the proportion of sits that are rewarded, and the variable part is the number of sits in between reinforcements. First you decide what ratio, or proportion, of sits you want to reinforce: 1 in 5 sits, 1 in 10 sits, 1 in 20 sits, whatever you decide on. So let’s say you decide to reinforce 1 out of every 5 sits (20%).
You then make sure that you average 1 reinforcement to every 5 sits, but that you vary the number of sits in between reinforcements (hence variable ratio - these names do make sense, sort of). It is very important not to reinforce the dog on every 5th sit, as he will see a pattern emerging. However, after a large number of sits he should have been reinforced on average once for every 5 sits.
So in 20 sits you would reinforce 4 times, but NOT on the 1st, 6th, 11th and 16th sits! You might instead reinforce the 2nd, 9th, 12th and 17th sits. In other words, the number of sits which don’t get reinforcement is different each time (variable), and the dog has no way of working out in advance which sit is going to be reinforced.
It would go like this:
Sits 1 to 5: SIT | SIT + R | SIT | SIT | SIT
Sits 6 to 10: SIT | SIT | SIT | SIT + R | SIT
Sits 11 to 15: SIT | SIT + R | SIT | SIT | SIT
Sits 16 to 20: SIT | SIT + R | SIT | SIT | SIT
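If you like to plan a session on paper before you train, the idea can be sketched in a few lines of Python. This is only an illustration, not a required method: the function names (variable_ratio_schedule, print_session) are made up for this example, and it assumes one simple way of varying the schedule, namely picking one sit at random from each block of 5 and throwing away any plan where the reinforced sits come out evenly spaced.

```python
import random


def variable_ratio_schedule(total_sits=20, ratio=5, rng=None):
    """Pick which sits to reinforce (1-based), averaging 1 per `ratio` sits.

    Assumption for this sketch: the session is split into consecutive blocks
    of `ratio` sits and one sit is chosen at random inside each block.
    """
    rng = rng or random.Random()
    blocks = total_sits // ratio
    if ratio < 2 or total_sits % ratio != 0 or blocks < 3:
        raise ValueError("need ratio >= 2 and at least 3 whole blocks of sits")
    while True:
        # One reinforced sit per block keeps the overall ratio exact.
        picks = [start + rng.randint(1, ratio)
                 for start in range(0, total_sits, ratio)]
        # Reject evenly spaced picks (e.g. 1, 6, 11, 16): the dog could
        # learn that pattern, which defeats the point of the schedule.
        gaps = {later - earlier for earlier, later in zip(picks, picks[1:])}
        if len(gaps) > 1:
            return picks


def print_session(total_sits, reinforced):
    """Print one line per sit, marking the reinforced ones with '+ R'."""
    for sit in range(1, total_sits + 1):
        marker = " + R" if sit in reinforced else ""
        print(f"{sit:2d}. SIT{marker}")


if __name__ == "__main__":
    schedule = variable_ratio_schedule(total_sits=20, ratio=5)
    print("Reinforce sits:", schedule)
    print_session(20, schedule)
```

Each run gives a different plan, which is the point: you should not be able to predict the next reinforced sit any better than your dog can. Change ratio to 10 or 20 once the behaviour is holding up well at 1 in 5.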
A reinforcement schedule like this has the effect of making the behaviour really persistent. For example, let’s say your dog is a jumper who loves jumping up to greet people. Sometimes this behaviour is reinforced through attention and petting, and sometimes it is ignored and not reinforced. Because the behaviour is intermittently reinforced, it stays strong and does not become extinct.
It works with people too. People who love to gamble or play slot machines keep doing so because they have won on a few occasions in the past. Las Vegas is the city of the variable ratio reinforcement schedule. This is the power of variable ratio reinforcement!
