The Power of Varying Rewards

Most of us are familiar with the concept of rewarding behaviors in order to make the dog produce them on cue. You say sit, the dog sits, and you give him a cookie. The dog is now more likely to sit in response to the word "sit" in the future.

We can also make behaviors more likely by "default," like when your dog settles nicely at your feet under the coffee shop table without being asked, and you slip him a little piece of chicken.

These are like the paychecks we give dogs. They do their job, they get paid, and they're more likely to do the same job next time. But here's the question that can really take your dog's reliability and precision to the next level: how good a job is he really doing each time, and how well are you motivating him with his paycheck?

For example, you may ask for a sit a dozen times during a training session, but not all sits are created equal. Some will be quicker, closer, or less crooked. However, you may be responding to them equally, with one little treat each time. That's hardly bad training, but you're missing out on a huge opportunity to get a better sit from your dog.

A few years ago, I attended a really fun and informative two-day seminar with Dr. Ian Dunbar, the influential animal behaviorist and dog trainer, and he illustrated these concepts in an elegant and practical series of games.

The basic takeaway is that once you start getting a behavior from a dog, from one as basic as sit to one as complicated as agility weaves, you should immediately start varying your reinforcement.

For the animal behavior science nerds out there: the idea of rewarding only the better responses and withholding rewards for inferior responses falls under the heading of differential reinforcement, which is a set of theories about how to improve the precision and reliability of existing behaviors.

What does that mean on a more practical level? Stop rewarding all successful behaviors equally, especially if your dog knows them fairly well. Instead, match the quality of the reward to the quality of the behavior. Save some really high-value rewards for the very best examples of a behavior (e.g., the straightest, quickest sits), and forgo rewarding mediocre examples of the behavior (e.g., the slower, more crooked sits). Think of it this way: if your dog is going to do a behavior 100 times, you're going to stop rewarding the 50 weakest, you're going to really throw a party for the 10 best, and you're going to give a normal reward for the 40 that are left. The hard part is swiftly evaluating the behavior as it happens and asking yourself: is this in the bottom 50%? (no reward); the top 50%? (reward); the top 10%? (high-value reward).
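For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of that 50/40/10 split. It assumes you could give each response a numeric quality score, which is purely hypothetical: in real training you are judging in the moment against your sense of the dog's average, not ranking a finished list. The function name and the score values are made up for illustration.

```python
def assign_rewards(scores, skip_fraction=0.5, jackpot_fraction=0.1):
    """Label each response 'none', 'normal', or 'jackpot' by its rank.

    scores: hypothetical quality ratings, higher meaning a better response.
    """
    n = len(scores)
    # Positions of the responses, ordered from weakest to best.
    order = sorted(range(n), key=lambda i: scores[i])
    n_skip = int(n * skip_fraction)
    n_jackpot = int(n * jackpot_fraction)

    rewards = ["normal"] * n
    for i in order[:n_skip]:
        rewards[i] = "none"      # weakest responses go unrewarded
    for i in order[n - n_jackpot:]:
        rewards[i] = "jackpot"   # very best responses get the party
    return rewards


# Ten sits with made-up quality scores from 1 (sloppy) to 10 (perfect).
sits = [3, 9, 1, 7, 5, 8, 2, 6, 4, 10]
print(assign_rewards(sits))
```

With ten sits, that works out to five unrewarded, four normal rewards, and one jackpot, which is the same proportion as the 100-repetition example.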

If that seems like you're going to be skipping a lot more rewards than you're used to, you might be right. If it seems like that might make the behavior less reliable, you're wrong.

Rewarding every correct response is fine when you're teaching something new, so it can "click" for the dog that he's doing something right. But if you keep rewarding the vast majority of successful responses in the long term, your dog won't respond as quickly and reliably as he would if the rewards became less frequent and less predictable. It's well demonstrated in behavioral science that rewarding randomly instead of every time produces more reliable behavior (Miltenberger 91).

So by not rewarding the weakest 50-60% of behaviors and really jackpotting the top few percent, you can accomplish three huge goals: you can become less reliant on treats, you can improve the quality of the behavior, and you can increase the reliability of the behavior.

Try it right now: go play a game with your dog where you're going to get 10 sits (or 10 recalls, 10 runs through the weave poles, etc.). Take a few normal treats and one amazing one. If you really think about what your dog's current average sit is like and carefully choose to reward only the above-average sits, with the amazing treat for the first one that's way above average, you'll be well on your way to tighter, more reliable behavior.

Sources:

Dunbar, Ian. "Reinforcement Schedules." Dogstar Daily. N.p., 9 Mar. 2010.

Miltenberger, Raymond G. "Schedules of Reinforcement." Behavior Modification: Principles and Procedures. 4th ed. Belmont, CA: Thomson Higher Education, 2008. 86-93.
