Most classes create objects that are essentially just data structures with several internal data fields (instance variables) plus methods to manipulate them.
Perl classes inherit methods, not data, but as long as
all access to the object is through method calls anyway, this works
out fine. If you want data inheritance, you have to effect it through
method inheritance. By and large, this is not a necessity in Perl,
because most classes store the attributes of their object in an
anonymous hash. The object's instance data is contained within this
hash, which serves as its own little namespace to be carved up by
whatever classes do something with the object. For example, if you
want an object called $city to have a data field named elevation,
you can simply access $city->{elevation}. No declarations are
necessary. But method wrappers have their uses.
Suppose you want to implement a Person object. You decide to have a
data field called "name", which by a strange coincidence you'll store
under the key name in the anonymous hash that will serve as the
object. But you don't
want users touching the data directly. To reap the rewards of
encapsulation, users need methods to access that instance variable
without lifting the veil of abstraction.
For example, you might make a pair of accessor methods:
    sub get_name {
        my $self = shift;
        return $self->{name};
    }

    sub set_name {
        my $self = shift;
        $self->{name} = shift;
    }
which leads to code like this:
    $him = Person->new();
    $him->set_name("Frodo");
    $him->set_name( ucfirst($him->get_name) );
You could even combine both methods into one:
    sub name {
        my $self = shift;
        if (@_) { $self->{name} = shift }
        return $self->{name};
    }
This would then lead to code like this:
    $him = Person->new();
    $him->name("Frodo");
    $him->name( ucfirst($him->name) );
The advantage of writing a separate function for each
instance variable (which for our Person class might
be name, age, height, and so on) is that it is direct, obvious, and
flexible. The drawback is that every time you want a new class, you
end up defining one or two nearly identical methods per instance
variable. This isn't too bad for the first few, and you're certainly
welcome to do it that way if you'd like. But when convenience is
preferred over flexibility, you might prefer one of the techniques
described in the following sections.
Note that we will be varying the implementation, not the interface. If users of your class respect the encapsulation, you'll be able to transparently swap one implementation for another without the users noticing. (Family members in your inheritance tree using your class for a subclass or superclass might not be so forgiving, since they know you far better than strangers do.) If your users have been peeking and poking into the private affairs of your class, the inevitable disaster is their own fault and none of your concern. All you can do is live up to your end of the contract by maintaining the interface. Trying to stop everyone else in the world from ever doing something slightly wicked will take up all your time and energy--and in the end, fail anyway.
Dealing with family members is more challenging. If a
subclass overrides a superclass's attribute accessor, should it access
the same field in the hash, or not? An argument can be made either
way, depending on the nature of the attribute. For the sake of safety
in the general case, each accessor can prefix the name of the hash
field with its own classname, so that subclass and superclass can both
have their own version. Several of the examples below, including the
standard Class::Struct module, use this
subclass-safe strategy. You'll see accessors resembling this:
    sub name {
        my $self  = shift;
        my $field = __PACKAGE__ . "::name";
        if (@_) { $self->{$field} = shift }
        return $self->{$field};
    }
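To see why the class-name prefix matters, here is a minimal sketch in which a subclass overrides the accessor. The Wizard subclass and the sample values are assumptions made up for illustration; only the accessor body comes from the text above.

```perl
use strict;
use warnings;

package Person;

sub new {
    my $invocant = shift;
    my $class    = ref($invocant) || $invocant;
    return bless {}, $class;
}

# Stores its value under the key "Person::name".
sub name {
    my $self  = shift;
    my $field = __PACKAGE__ . "::name";
    if (@_) { $self->{$field} = shift }
    return $self->{$field};
}

package Wizard;    # hypothetical subclass, for illustration only
our @ISA = ("Person");

# Same method name, but the key is "Wizard::name", so nothing collides.
sub name {
    my $self  = shift;
    my $field = __PACKAGE__ . "::name";
    if (@_) { $self->{$field} = shift }
    return $self->{$field};
}

package main;

my $mage = Wizard->new;
$mage->name("Gandalf");           # stored under Wizard::name
$mage->Person::name("Olorin");    # stored under Person::name

print join(", ", sort keys %$mage), "\n";
print $mage->name, " / ", $mage->Person::name, "\n";
```

Both values live side by side in the same hash, each under its own class's key, so neither accessor overwrites the other's data.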
In each of the following examples, we create a simple Person class
with fields name, race, and aliases, each with an
identical interface but a completely different implementation. We're
not going to tell you which one we like the best, because we like them
all the best, depending on the occasion. And tastes differ. Some folks
prefer stewed conies; others prefer fissssh.
Objects don't have to be implemented as anonymous hashes. Any reference will do. For example, if you used an anonymous array, you could set up a constructor like this:
    sub new {
        my $invocant = shift;
        my $class    = ref($invocant) || $invocant;
        return bless [], $class;
    }
and have accessors like these:
    sub name {
        my $self = shift;
        if (@_) { $self->[0] = shift }
        return $self->[0];
    }

    sub race {
        my $self = shift;
        if (@_) { $self->[1] = shift }
        return $self->[1];
    }

    sub aliases {
        my $self = shift;
        if (@_) { $self->[2] = shift }
        return $self->[2];
    }
Arrays are somewhat faster to access than hashes and don't take up quite as much memory, but they're not at all convenient to use. You have to keep track of the index numbers (not just in your class, but in your superclass, too), which must somehow indicate which pieces of the array your class is using. Otherwise, you might reuse a slot.
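One common way to ease the bookkeeping is to name the slots with constants. This is only a sketch: the constant names NAME, RACE, and ALIASES are assumptions chosen for illustration, and a subclass would still need to know which indices its superclass has already claimed.

```perl
use strict;
use warnings;

package Person;

# Hypothetical slot names; the index numbers now live in one place.
use constant {
    NAME    => 0,
    RACE    => 1,
    ALIASES => 2,
};

sub new {
    my $invocant = shift;
    my $class    = ref($invocant) || $invocant;
    return bless [], $class;
}

sub name {
    my $self = shift;
    if (@_) { $self->[NAME] = shift }
    return $self->[NAME];
}

sub race {
    my $self = shift;
    if (@_) { $self->[RACE] = shift }
    return $self->[RACE];
}

sub aliases {
    my $self = shift;
    if (@_) { $self->[ALIASES] = shift }
    return $self->[ALIASES];
}

package main;

my $him = Person->new;
$him->name("Frodo");
$him->race("Hobbit");
$him->aliases(["Underhill"]);

printf "%s (%s), also known as %s\n",
    $him->name, $him->race, join(", ", @{ $him->aliases });
```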
The use fields pragma addresses all of these points:
    package Person;
    use fields qw(name race aliases);
This pragma does not create accessor methods for you,
but it does rely on some built-in magic (called
pseudohashes) to do something similar. (You may
wish to wrap accessors around the fields anyway, as we do in the
following example.) Pseudohashes are array references that you can
use like hashes because they have an associated key map table. The
use fields pragma sets this key map up for you,
effectively declaring which fields are valid for the
Person object; this makes the Perl compiler aware
of them. If you declare the type of your object variable (as in
my Person $self, in the next example), the
compiler is smart enough to optimize access to the fields into
straight array accesses. Perhaps more importantly, it validates
field names for type safety (well, typo safety, really) at compile
time. (See Section 8.3.5 in Chapter 8.)
A constructor and sample accessors would look like this:
    package Person;
    use fields qw(name race aliases);

    sub new {
        my $type = shift;
        my Person $self = fields::new(ref $type || $type);
        $self->{name}    = "unnamed";
        $self->{race}    = "unknown";
        $self->{aliases} = [];
        return $self;
    }

    sub name {
        my Person $self = shift;
        $self->{name} = shift if @_;
        return $self->{name};
    }

    sub race {
        my Person $self = shift;
        $self->{race} = shift if @_;
        return $self->{race};
    }

    sub aliases {
        my Person $self = shift;
        $self->{aliases} = shift if @_;
        return $self->{aliases};
    }
    1;
If you misspell one of the literal keys used to access the
pseudohash, you won't have to wait until run time to learn about
this. The compiler knows what type of object
$self is supposed to refer to (because you told
it), so it can check that the code accesses only those fields that
Person objects actually have. If you have horses
on the brain and try to access a nonexistent field (such as
$self->{mane}), the compiler can flag this
error right away and will never turn the erroneous program over to
the interpreter to run.
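The following sketch shows one way to watch that check fire without aborting the whole program: the offending code is compiled inside a string eval, so the failure can be caught and printed. The $steed variable and the cut-down constructor are assumptions for illustration. On later Perl releases, where fields objects are implemented as restricted hashes rather than pseudohashes, the misspelled key should still be rejected.

```perl
use strict;
use warnings;

package Person;
use fields qw(name race aliases);

sub new {
    my $type = shift;
    my Person $self = fields::new(ref $type || $type);
    $self->{name} = "unnamed";
    return $self;
}

package main;

# A string eval compiles its code at run time, which lets this sketch
# trap the failure instead of letting it stop the program.
my $ok = eval q{
    my Person $steed = Person->new;
    $steed->{mane} = "Shadowfax";    # typo for {name}
    1;
};
my $err = $ok ? "" : $@;
print "rejected: $err" unless $ok;
```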
There's still a bit of repetition in declaring methods to get at instance variables, so you still might like to automate the creation of simple accessor methods using one of the techniques below. However, because all these techniques use some sort of indirection, if you use them, you will lose the compile-time benefits of typo-checking lexically typed hash accesses. You'll still keep the (small) time and space advantages, though.
If you do elect to use a pseudohash to implement your
class, any class that inherits from this one must be aware of that
underlying pseudohash implementation. If an object is implemented as
a pseudohash, all participants in the inheritance hierarchy should
employ the use base and use fields declarations. For example,
    package Wizard;
    use base "Person";
    use fields qw(staff color sphere);
This makes the Wizard module a subclass of
class Person, and loads the
Person.pm file. It also registers three new
fields in this class to go along with those from Person.
That way when you write:
my Wizard $mage = fields::new("Wizard");
you'll get a pseudohash object with access to both classes' fields:
    $mage->name("Gandalf");
    $mage->color("Grey");
Since all subclasses must know that they are using a pseudohash implementation, they should use the direct pseudohash notation for both efficiency and type safety:
    $mage->{name}  = "Gandalf";
    $mage->{color} = "Grey";
If you want to keep your implementations interchangeable, however, outside users of your class must use the accessor methods.
Although use base supports only single
inheritance, this is seldom a severe restriction. See the
descriptions of use base and use fields in the Glossary.
The standard Class::Struct module
exports a function named struct. This creates all
the trappings you'll need to get started on an entire class. It
generates a constructor named new, plus accessor
methods for each of the data fields (instance variables) named in
that structure.
For example, if you put the class in a Person.pm file:
    package Person;
    use Class::Struct;

    struct Person => {    # create a definition for a "Person"
        name    => '$',   # name field is a scalar
        race    => '$',   # race field is also a scalar
        aliases => '@',   # but aliases field is an array ref
    };
    1;
Then you could use that module this way:
    use Person;
    my $mage = Person->new();
    $mage->name("Gandalf");
    $mage->race("Istar");
    $mage->aliases( ["Mithrandir", "Olorin", "Incanus"] );
The Class::Struct module created all four
of those methods. Because it follows the subclass-safe policy of
always prefixing the field name with the class name, it also permits
an inherited class to have its own separate field of the same name
as a base class field without conflict. That means in this case that
"Person::name" rather than just "name" is used for the hash key for
that particular instance variable.
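You can observe the prefixed keys by listing the keys of a generated object. The sketch below puts the class and the calling code in one file for convenience (an assumption; the text keeps the class in Person.pm) and peeks inside the object purely to show the storage layout, which ordinary users of the class should never rely on.

```perl
use strict;
use warnings;

package Person;
use Class::Struct;

struct Person => {
    name    => '$',
    race    => '$',
    aliases => '@',
};

package main;

my $mage = Person->new(name => "Gandalf", race => "Istar");

# Looking inside the object only to show how the fields are keyed.
print "$_\n" for sort keys %$mage;
print $mage->name, "\n";
```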
Fields in a struct declaration don't have
to be basic Perl types. They can also specify other classes, but
classes created with struct work best because the
function makes assumptions about how the classes behave that aren't
generally true of all classes. For example, the
new method for the appropriate class is invoked
to initialize the field, but many classes have constructors with
other names.
See the description of Class::Struct in
Chapter 32, and its online
documentation for more information. Many standard modules use
Class::Struct to implement their classes,
including User::pwent and Net::hostent. Reading their code can prove
instructive.
As we mentioned earlier, when you invoke a
nonexistent method, Perl has two different ways to look for an
AUTOLOAD method, depending on whether you
declared a stub method. You can use this property to provide access
to the object's instance data without writing a separate function
for each instance variable. Inside the AUTOLOAD routine,
the name of the method actually invoked can be retrieved from the
$AUTOLOAD variable. Consider the following code:
    use Person;
    $him = Person->new;
    $him->name("Aragorn");
    $him->race("Man");
    $him->aliases( ["Strider", "Estel", "Elessar"] );
    printf "%s is of the race of %s.\n", $him->name, $him->race;
    print "His aliases are: ", join(", ", @{$him->aliases}), ".\n";
As before, this version of the Person class
implements a data structure with three fields:
name, race, and aliases:
    package Person;
    use Carp;

    my %Fields = (
        "Person::name"    => "unnamed",
        "Person::race"    => "unknown",
        "Person::aliases" => [],
    );

    # The next declaration guarantees we get our own autoloader.
    use subs qw(name race aliases);

    sub new {
        my $invocant = shift;
        my $class    = ref($invocant) || $invocant;
        my $self     = { %Fields, @_ };    # clone like Class::Struct
        bless $self, $class;
        return $self;
    }

    sub AUTOLOAD {
        my $self = shift;
        # only handle instance methods, not class methods
        croak "$self not an object" unless ref($self);
        my $name = our $AUTOLOAD;
        return if $name =~ /::DESTROY$/;
        unless (exists $self->{$name}) {
            croak "Can't access `$name' field in $self";
        }
        if (@_) { return $self->{$name} = shift }
        else    { return $self->{$name} }
    }
As you see, there are no methods named
name, race, or aliases anywhere to be found. The
AUTOLOAD routine takes care of all that. When
someone uses $him->name("Aragorn"), the
AUTOLOAD subroutine is called with
$AUTOLOAD set to "Person::name". Conveniently, by leaving it fully
qualified, it's in exactly the right form for accessing fields of
the object hash. That way if you use this class as part of a larger
class hierarchy, you don't conflict with uses of the same name in
other classes.
Most accessor methods do essentially the same thing: they simply fetch or store a value from that instance variable. In Perl, the most natural way to create a family of near-duplicate functions is looping around a closure. But closures are anonymous functions lacking names, and methods need to be named subroutines in the class's package symbol table so that they can be called by name. This is no problem--just assign the closure reference to a typeglob of the appropriate name.
    package Person;

    sub new {
        my $invocant = shift;
        my $self = bless({}, ref $invocant || $invocant);
        $self->init();
        return $self;
    }

    sub init {
        my $self = shift;
        $self->name("unnamed");
        $self->race("unknown");
        $self->aliases([]);
    }

    for my $field (qw(name race aliases)) {
        my $slot = __PACKAGE__ . "::$field";
        no strict "refs";    # So symbolic ref to typeglob works.
        *$field = sub {
            my $self = shift;
            $self->{$slot} = shift if @_;
            return $self->{$slot};
        };
    }
Closures are the cleanest hand-rolled way to create a
multitude of accessor methods for your instance data. It's efficient
for both the computer and you. Not only do all the accessors share
the same bit of code (they only need their own lexical pads), but
later if you decide to add another attribute, the changes required
are minimal: just add one more word to the for
loop's list, and perhaps something to the init method.
So far, these techniques for managing instance data have offered no mechanism for "protection" from external access. Anyone outside the class can open up the object's black box and poke about inside--if they don't mind voiding the warranty. Enforced privacy tends to get in the way of people trying to get their jobs done. Perl's philosophy is that it's better to encapsulate one's data with a sign that says:
IN CASE OF FIRE BREAK GLASS
You should respect such encapsulation when possible, but still have easy access to the contents in an emergency situation, like for debugging.
But if you do want to enforce privacy, Perl isn't about to get in your way. Perl offers low-level building blocks that you can use to surround your class and its objects with an impenetrable privacy shield--one stronger, in fact, than that found in many popular object-oriented languages. Lexical scopes and the lexical variables inside them are the key components here, and closures play a pivotal role.
In Section 12.5.5 we saw how a class can use closures to implement methods that are invisible outside the module file. Later we'll look at accessor methods that regulate class data so private that not even the rest of the class has unrestricted access. Those are still fairly traditional uses of closures. The truly interesting approach is to use a closure as the very object itself. The object's instance variables are locked up inside a scope to which the object alone--that is, the closure--has free access. This is a very strong form of encapsulation; not only is it proof against external tampering, even other methods in the same class must use the proper access methods to get at the object's instance data.
Here's an example of how this might work. We'll use closures both for the objects themselves and for the generated accessors:
    package Person;

    sub new {
        my $invocant = shift;
        my $class    = ref($invocant) || $invocant;
        my $data     = {
            NAME    => "unnamed",
            RACE    => "unknown",
            ALIASES => [],
        };
        my $self = sub {
            my $field = shift;
            #############################
            ### ACCESS CHECKS GO HERE ###
            #############################
            if (@_) { $data->{$field} = shift }
            return $data->{$field};
        };
        bless($self, $class);
        return $self;
    }

    # generate method names
    for my $field (qw(name race aliases)) {
        no strict "refs";    # for access to the symbol table
        *$field = sub {
            my $self = shift;
            return $self->(uc $field, @_);
        };
    }
The object created and returned by the new
method is no longer a hash, as it was in other constructors we've
looked at. It's a closure with unique access to the attribute data
stored in the hash referred to by $data. Once the
constructor call is finished, the only access to
$data (and hence to the attributes) is via the closure.
In a call like $him->name("Bombadil"),
the invoking object stored in $self is the
closure that was blessed and returned by the constructor. There's
not a lot one can do with a closure beyond calling it, so we do just
that with $self->(uc $field, @_). Don't be
fooled by the arrow; this is just a regular indirect function call,
not a method invocation. The initial argument is the string
"NAME" (the field name, uppercased by uc), and any remaining
arguments are whatever else was passed in. Once we're executing
inside the closure, the hash reference inside $data is again
accessible. The closure is then free to permit or deny access to
whatever it pleases.
No one outside the closure object has unmediated
access to this very private instance data, not even other methods in
the class. They could try to call the closure the way the methods
generated by the for loop do, perhaps setting an
instance variable the class never heard of. But this approach is
easily blocked by inserting various bits of code in the constructor
where you see the comment about access checks. First, we need a
common preamble:
    use Carp;
    local $Carp::CarpLevel = 1;    # Keeps croak messages short
    my ($cpack, $cfile) = caller();
Now for each of the checks. The first one makes sure the specified attribute name exists:
    croak "No valid field '$field' in object"
        unless exists $data->{$field};
This one allows access only by callers from the same file:
    croak "Unmediated access denied to foreign file"
        unless $cfile eq __FILE__;
This one allows access only by callers from the same package:
    croak "Unmediated access denied to foreign package ${cpack}::"
        unless $cpack eq __PACKAGE__;
And this one allows access only by callers whose classes inherit ours:
    croak "Unmediated access denied to unfriendly class ${cpack}::"
        unless $cpack->isa(__PACKAGE__);
All these checks block unmediated access only. Users of the class who politely use the class's designated methods are under no such restriction. Perl gives you the tools to be just as persnickety as you want to be. Fortunately, not many people want to be.
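Putting the pieces together, here is a sketch of the closure-based class with two of the checks in place: the field-existence check and the same-package check. The choice of checks and the test calls at the end are assumptions for illustration, and the CarpLevel adjustment is left out for brevity.

```perl
use strict;
use warnings;

package Person;
use Carp;

sub new {
    my $invocant = shift;
    my $class    = ref($invocant) || $invocant;
    my $data     = { NAME => "unnamed", RACE => "unknown", ALIASES => [] };

    my $self = sub {
        my $field = shift;

        # Find out which package is calling the closure directly.
        my ($cpack) = caller();

        croak "No valid field '$field' in object"
            unless exists $data->{$field};
        croak "Unmediated access denied to foreign package ${cpack}::"
            unless $cpack eq __PACKAGE__;

        if (@_) { $data->{$field} = shift }
        return $data->{$field};
    };
    return bless $self, $class;
}

# Generate the accessors; these live in package Person, so the
# closure sees them as friendly callers.
for my $field (qw(name race aliases)) {
    no strict "refs";    # for access to the symbol table
    *$field = sub {
        my $self = shift;
        return $self->(uc $field, @_);
    };
}

package main;

my $him = Person->new;
$him->name("Bombadil");    # polite access through the method
print $him->name, "\n";

# Calling the closure directly from another package is refused.
my $ok  = eval { $him->("NAME", "Goldberry"); 1 };
my $err = $ok ? "" : $@;
print "direct call refused: $err" unless $ok;
```

Because the direct call dies before the assignment, the name stored in the object is unchanged afterward.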
But some people ought to be. Persnickety is good when
you're writing flight control software. If you either want or ought
to be one of those people, and you prefer using working code over
reinventing everything on your own, check out Damian Conway's
Tie::SecureHash module on CPAN. It implements
restricted hashes with support for public, protected, and private
persnicketations. It also copes with the inheritance issues that
we've ignored in the previous example. Damian has also written an
even more ambitious module, Class::Contract, that
imposes a formal software engineering regimen over Perl's flexible
object system. This module's feature list reads like a checklist
from a computer science professor's
software engineering textbook, including enforced encapsulation,
static inheritance,
and design-by-contract condition checking for object-oriented Perl,
along with a declarative syntax for attribute, method, constructor,
and destructor definitions at both the object and class level, and
preconditions, postconditions, and class invariants. Whew!
As of release 5.6 of Perl, you can also declare a method to indicate that it returns an lvalue. This is done with the lvalue subroutine attribute (not to be confused with object attributes). This experimental feature allows you to treat the method as something that would appear on the lefthand side of an equal sign:
    package Critter;

    sub new {
        my $class = shift;
        my $self  = { pups => 0, @_ };    # Override default.
        bless $self, $class;
    }

    sub pups : lvalue {                   # We'll assign to pups() later.
        my $self = shift;
        $self->{pups};
    }

    package main;
    $varmint = Critter->new(pups => 4);
    $varmint->pups *= 2;                  # Assign to $varmint->pups!
    $varmint->pups =~ s/(.)/$1$1/;        # Modify $varmint->pups in place!
    print $varmint->pups;                 # Now we have 88 pups.
This lets you pretend $varmint->pups is
a variable while still obeying encapsulation. See Section 6.5.2 in Chapter 6.
If you're running a threaded version of Perl and want
to ensure that only one thread can call a particular method on an
object, you can use the locked and method
attributes to do that:
sub pups : locked method { … }
When any thread invokes the pups method on
an object, Perl locks the object before execution, preventing other
threads from doing the same. See Section 6.5.1 in Chapter 6.