Advanced Python: Dot Operator

I’ll start with one trivial question: what is a “dot operator”?

Here is an example:

hello = 'Hello world!'

print(hello.upper())
# HELLO WORLD!

Well, this is unquestionably a “Hello World” example, but I can hardly imagine anyone starting to teach you Python exactly like this. Anyway, the “dot operator” is the “.” part of hello.upper(). Let’s try a more verbose example:

class Person:

    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}")

p = Person('John')
p.shout()
# Hey! I'm John

p.num_of_persons
# 0

p.name
# 'John'

There are a couple of places where you use the “dot operator”. To make it easier to see the bigger picture, let’s summarize how you use it in two cases:

  • use it to access attributes of an object or class,
  • use it to access functions defined in the class definition.

Obviously, we have all of this in our example, and it seems intuitive and as expected. But there is more to this than meets the eye! Take a closer look at this example:

p.shout
# <bound method Person.shout of <__main__.Person object at 0x...>>

id(p.shout)
# 4363645248

Person.shout
# <function Person.shout at 0x...>

id(Person.shout)
# 4364388816

Somehow, p.shout is not referencing the same function as Person.shout, even though it should. At least you’d expect it to, right? And p.shout is not even a function! Let’s go over the next example before we start discussing what is happening:

class Person:

    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

p = Person('John')

vars(p)
# {'name': 'John'}

def shout_v2(self):
    print("Hey, what's up?")

p.shout_v2 = shout_v2

vars(p)
# {'name': 'John', 'shout_v2': <function shout_v2 at 0x...>}

p.shout()
# Hey! I'm John.

p.shout_v2()
# TypeError: shout_v2() missing 1 required positional argument: 'self'

For those unaware of the vars function, it returns the dictionary that holds the attributes of an instance. If you run vars(Person) you’ll get a somewhat different response, but you’ll get the picture. It contains both attributes with their values and variables that hold class function definitions. There is obviously a difference between an object that is an instance of a class and the class object itself, so the vars function responds differently for these two.
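To make the difference concrete, here is a small sketch that compares the two namespaces. It reuses the Person class from above, except that shout returns its text instead of printing it (an assumption made only so the results are easy to check).

```python
class Person:
    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        return f"Hey! I'm {self.name}."


p = Person('John')

# The instance namespace holds only what was assigned on the instance.
instance_names = set(vars(p))

# The class namespace holds class attributes and the functions from the
# class body (plus entries Python adds itself, such as __module__).
class_names = set(vars(Person))

print(instance_names)                   # {'name'}
print('shout' in class_names)           # True
print('num_of_persons' in class_names)  # True
print('shout' in instance_names)        # False
```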

Now, it is perfectly valid to additionally define a function after an object is created. That is the line p.shout_v2 = shout_v2. This does introduce another key-value pair in the instance dictionary. Seemingly everything is fine, and we should be able to run it smoothly, as if shout_v2 were specified in the class definition. But alas! Something is truly wrong. We are not able to call it the same way as we did the shout method.

Astute readers will have noticed by now how carefully I use the terms function and method. After all, there is a difference in how Python prints these as well. Take a look at the previous examples. shout is a method, shout_v2 is a function. At least if we look at these from the perspective of the object p. If we look at these from the perspective of the Person class, shout is a function, and shout_v2 doesn’t exist. It is defined only in the object’s dictionary (namespace). So if you are really going to rely on object-oriented paradigms and mechanisms like encapsulation, inheritance, abstraction, and polymorphism, you won’t define functions on objects, like p in our example. You’ll make sure you are defining functions in a class definition (body).
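If you do need to attach behavior after the fact, there are two ways that avoid the TypeError. The sketch below is only an illustration: it uses a variant of shout_v2 that returns a string, and the name shout_v3 is made up for this example.

```python
import types


class Person:
    def __init__(self, name):
        self.name = name


def shout_v2(self):
    return f"Hey, what's up? I'm {self.name}."


p = Person('John')

# Option 1: bind the function to this one instance by hand.
p.shout_v2 = types.MethodType(shout_v2, p)
bound_on_instance = p.shout_v2()

# Option 2: attach the function to the class, so every instance gets it
# through the normal lookup (shout_v3 is a hypothetical name).
Person.shout_v3 = shout_v2
q = Person('Jane')
bound_via_class = q.shout_v3()

print(bound_on_instance)  # Hey, what's up? I'm John.
print(bound_via_class)    # Hey, what's up? I'm Jane.
```

The second option is usually the cleaner one, because the function then lives in the class namespace like every other class function.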

So why are these two different, and why do we get the error? Well, the short answer is because of how the “dot operator” works. The longer answer is that there is a mechanism behind the scenes that does the (attribute) name resolution for you. This mechanism consists of the __getattribute__ and __getattr__ dunder methods.

At first, this will probably sound unintuitive and rather unnecessarily complicated, but bear with me. Essentially, there are two scenarios that can occur when you try to access an attribute of an object in Python: either the attribute exists or it does not. Simple as that. In both cases, __getattribute__ is called; to put it simply, it is always called. This method:

  • returns the computed attribute value,
  • explicitly calls __getattr__, or
  • raises AttributeError, in which case __getattr__ is called by default.

If you want to intercept the mechanism that resolves attribute names, this is the place to hijack. You just have to be careful, because it is really easy to end up in an infinite loop or to mess up the whole mechanism of name resolution, especially in the scenario of object-oriented inheritance. It is not as simple as it may appear.
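A minimal sketch of the infinite-loop trap, assuming two throwaway classes named Broken and Safe that exist only for this illustration. Reading self.__dict__ inside __getattribute__ is itself an attribute access, so it re-enters the same method until Python gives up.

```python
class Broken:
    def __init__(self):
        self.value = 42

    def __getattribute__(self, name):
        # self.__dict__ is an attribute access on self, so this line
        # calls __getattribute__ again, and again, and again...
        return self.__dict__[name]


class Safe:
    def __init__(self):
        self.value = 42

    def __getattribute__(self, name):
        # Delegating to the default implementation avoids the recursion.
        return super().__getattribute__(name)


try:
    Broken().value
    outcome = 'no error'
except RecursionError:
    outcome = 'RecursionError'

print(outcome)       # RecursionError
print(Safe().value)  # 42
```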

If you want to handle cases where there is no attribute in the object’s dictionary, you can directly implement the __getattr__ method. This one gets called when __getattribute__ fails to access the attribute name. If this method can’t find an attribute or deal with a missing one after all, it raises an AttributeError exception as well. Here is how you can play around with these:

class Person:

    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

    def __getattribute__(self, name):
        print(f'getting the attribute name: {name}')
        return super().__getattribute__(name)

    def __getattr__(self, name):
        # Double quotes are needed here because the text contains an apostrophe.
        print(f"this attribute doesn't exist: {name}")
        raise AttributeError()

p = Person('John')

p.name
# getting the attribute name: name
# 'John'

p.name1
# getting the attribute name: name1
# this attribute doesn't exist: name1
#
# ... exception stack trace
# AttributeError:
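As a further illustration of __getattr__ acting purely as a fallback, here is a sketch with a hypothetical Settings class (the class, its _fallbacks dictionary, and the attribute names are all invented for this example). Values set on the instance are found by the normal lookup, and __getattr__ only runs for names that lookup could not resolve.

```python
class Settings:
    """Hypothetical example class: explicit values plus fallbacks."""

    _fallbacks = {'theme': 'light', 'language': 'en'}

    def __init__(self, **values):
        for key, value in values.items():
            setattr(self, key, value)

    def __getattr__(self, name):
        # Reached only after the normal lookup has failed.
        try:
            return self._fallbacks[name]
        except KeyError:
            raise AttributeError(name) from None


s = Settings(theme='dark')

print(s.theme)                 # dark  (instance dictionary, no fallback needed)
print(s.language)              # en    (supplied by __getattr__)
print(hasattr(s, 'timezone'))  # False (__getattr__ raised AttributeError)
```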

It is very important to call super().__getattribute__(...) in your implementation of __getattribute__, and the reason, as I wrote earlier, is that a lot is going on in Python’s default implementation. And this is exactly where the “dot operator” gets its magic from. Well, at least half of the magic is there. The other half is in how a class object is created after the class definition is interpreted.

I use the term “class functions” on purpose. A class contains only functions, and we saw this in one of the first examples:

p.shout
# <bound method Person.shout of <__main__.Person object at 0x...>>

Person.shout
# <function Person.shout at 0x...>

When looking from the object’s perspective, these are called methods. The process of transforming a function of a class into a method of an object is called binding, and the result is what you see in the previous example, a bound method. What makes it bound, and to what? Well, once you have an instance of a class and start calling its methods, you are, in essence, passing the object reference to each of its methods. Remember the self argument? So, how does this happen, and who does it?

Well, the first part happens when the class body is interpreted. Quite a few things happen in this process, like defining a class namespace, adding attribute values to it, defining (class) functions, and binding them to their names. Now, as these functions are defined, they are wrapped in a way: wrapped in an object conceptually called a descriptor. This descriptor enables the change in identity and behavior of class functions that we saw previously. I’ll make sure to write a separate blog post about descriptors, but for now, know that this object is an instance of a class that implements a predefined set of dunder methods. This is also called a protocol. Once these are implemented, objects of this class are said to follow the specific protocol and therefore behave in the expected way. There is a difference between data and non-data descriptors. The former implement the __set__ and/or __delete__ dunder methods, usually together with __get__. The latter implement only the __get__ method. Anyway, every function in a class ends up being wrapped in a so-called non-data descriptor.
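The data versus non-data distinction can be checked directly. The sketch below assumes a small helper called descriptor_kind, which is not part of Python and is written only for this illustration. It inspects which dunder methods the type of an object defines.

```python
def descriptor_kind(obj):
    """Classify obj by the dunder methods its type defines (hypothetical helper)."""
    cls = type(obj)
    if hasattr(cls, '__set__') or hasattr(cls, '__delete__'):
        return 'data descriptor'
    if hasattr(cls, '__get__'):
        return 'non-data descriptor'
    return 'not a descriptor'


class Person:
    def shout(self):
        return 'Hey!'

    @property
    def greeting(self):
        return 'Hello'


# Read the raw objects from the class namespace so that no binding happens.
namespace = vars(Person)

print(descriptor_kind(namespace['shout']))     # non-data descriptor
print(descriptor_kind(namespace['greeting']))  # data descriptor
print(descriptor_kind(42))                     # not a descriptor
```

A plain function lands in the non-data group, while a property, which also defines __set__ and __delete__, lands in the data group.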

Once you initiate attribute lookup by using the “dot operator”, the __getattribute__ method is called, and the whole process of name resolution starts. This process stops when resolution is successful, and it goes something like this:

  1. return the data descriptor that has the desired name (class level), or
  2. return the instance attribute with the desired name (instance level), or
  3. return the non-data descriptor with the desired name (class level), or
  4. return the class attribute with the desired name (class level), or
  5. raise AttributeError, which essentially calls the __getattr__ method.
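The order of these five steps can be observed with a small experiment. This is a sketch with invented classes (Data, NonData, Demo), not code from the official documentation. The instance dictionary is filled directly through vars so that the data descriptor's __set__ is bypassed.

```python
class NonData:
    def __get__(self, obj, objtype=None):
        return 'non-data descriptor'


class Data:
    def __get__(self, obj, objtype=None):
        return 'data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class Demo:
    a = Data()
    b = NonData()
    c = 'class attribute'

    def __getattr__(self, name):
        return f'__getattr__ fallback for {name}'


d = Demo()
# Write straight into the instance dictionary, bypassing __set__.
vars(d).update(a='instance a', b='instance b', c='instance c')

print(d.a)  # data descriptor  (step 1 beats the instance dictionary)
print(d.b)  # instance b       (step 2 beats the non-data descriptor)
print(d.c)  # instance c       (step 2 beats the plain class attribute)

fresh = Demo()
print(fresh.b)  # non-data descriptor          (step 3)
print(fresh.c)  # class attribute              (step 4)
print(fresh.z)  # __getattr__ fallback for z   (step 5)
```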

My initial idea was to leave you with a reference to the official documentation on how this mechanism is implemented, at least as a Python mockup, for learning purposes, but I have decided to help you out with that part as well. Still, I highly advise you to go and read the whole page of the official documentation.

So, in the next code snippet, I’ll put some of the descriptions in the comments, so the code is easier to read and understand. Here it is:

# Helper used below; it also comes from the official documentation mockup.
def find_name_in_mro(cls, name, default):
    "Emulate _PyType_Lookup() in Objects/typeobject.c"
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default

def object_getattribute(obj, name):
    "Emulate PyObject_GenericGetAttr() in Objects/object.c"
    # Create vanilla object for later use.
    null = object()

    """
    obj is an object instantiated from our custom class. Here we
    find the class it was instantiated from.
    """
    objtype = type(obj)

    """
    name represents the name of the class function, instance attribute,
    or any class attribute. Here, we try to find it and keep a
    reference to it. MRO is short for Method Resolution Order, and it
    has to do with class inheritance. Not really that important at
    this point. Let's say that this mechanism finds the name by
    searching through all parent classes in order.
    """
    cls_var = find_name_in_mro(objtype, name, null)

    """
    Here we check if this class attribute is an object that has the
    __get__ method implemented. If it does, it is a descriptor
    (data or non-data). This is important for further steps.
    """
    descr_get = getattr(type(cls_var), '__get__', null)

    """
    So now either our class attribute references a descriptor, in
    which case we test to see whether it is a data descriptor and
    return the result of calling the descriptor's __get__ method,
    or we go to the next if code block.
    """
    if descr_get is not null:
        if (hasattr(type(cls_var), '__set__')
                or hasattr(type(cls_var), '__delete__')):
            return descr_get(cls_var, obj, objtype)  # data descriptor

    """
    In cases where the name doesn't reference a data descriptor, we
    check to see if it references a variable in the object's
    dictionary, and if so, we return its value.
    """
    if hasattr(obj, '__dict__') and name in vars(obj):
        return vars(obj)[name]  # instance variable

    """
    In cases where the name doesn't reference a variable in the
    object's dictionary, we check whether it references a non-data
    descriptor and return the result of calling its __get__ method.
    """
    if descr_get is not null:
        return descr_get(cls_var, obj, objtype)  # non-data descriptor

    """
    In case name didn't reference anything from above, we check whether
    it references a class attribute and return its value.
    """
    if cls_var is not null:
        return cls_var  # class variable

    """
    If name resolution was unsuccessful, we raise an AttributeError
    exception, and __getattr__ is invoked.
    """
    raise AttributeError(name)

Keep in mind that this implementation is in Python for the sake of documenting and describing the logic implemented in the __getattribute__ method. In reality, it is implemented in C. Just by looking at it, you can imagine that it is better not to play around with re-implementing the whole thing. The best way is to do part of the resolution yourself and then fall back on the CPython implementation with return super().__getattribute__(name), as shown in the example above.
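Here is one way that advice might look in practice. It is a sketch under a made-up convention: names starting with the prefix upper_ are resolved by hand, and everything else is handed to the default implementation.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __getattribute__(self, name):
        # Custom part: resolve the hypothetical "upper_" prefix ourselves.
        if name.startswith('upper_'):
            original = super().__getattribute__(name[len('upper_'):])
            return str(original).upper()
        # Everything else goes through the default resolution.
        return super().__getattribute__(name)


p = Person('John')

print(p.name)        # John
print(p.upper_name)  # JOHN
```

Because the custom branch still asks super() for the underlying value, a name such as upper_missing fails with the usual AttributeError instead of some hand-rolled error.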

The important thing here is that every class function (which is an object) gets wrapped in a non-data descriptor (which is a function class object), which means that this wrapper object has the __get__ dunder method defined. What this dunder method does is return a new callable (think of it as a new function), whose first argument is the reference to the object on which we are performing the “dot operator”. I said to think of it as a new function because it is a callable. In essence, it is another object, of type MethodType. Check it out:

type(p.shout)
# getting the attribute name: shout
# method

type(Person.shout)
# function

One interesting thing is this function class. It is exactly the wrapper object that defines the __get__ method. However, once we access it as the method shout through the “dot operator”, __getattribute__ goes through the list above and stops at the third case (return non-data descriptor). This __get__ method contains additional logic that takes the object’s reference and creates a MethodType with references to the function and the object.

Here is the official documentation mockup:

class Function:
    ...

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return MethodType(self, obj)

Disregard the difference in the class name. I have been using function instead of Function to make it easier to grasp, but I’ll use the Function name from now on so that it follows the official documentation.
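You can trigger this __get__ logic by hand on a real function. The sketch below fetches the raw function from the class namespace (so the dot operator does not get a chance to bind it) and then calls __get__ both ways. As before, shout returns its text here only so that the results are easy to compare.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def shout(self):
        return f"Hey! I'm {self.name}."


p = Person('John')

# Fetch the raw function from the class namespace, bypassing the dot operator.
raw = vars(Person)['shout']

# Calling __get__ with an instance produces a bound method...
bound = raw.__get__(p, Person)
# ...while calling it with None hands back the function itself.
unbound = raw.__get__(None, Person)

print(type(bound).__name__)  # method
print(bound())               # Hey! I'm John.
print(unbound is raw)        # True
print(bound == p.shout)      # True
```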

Anyway, this mockup alone may be enough to understand how the function class fits into the picture, but let me add a couple of missing lines of code, which will probably make things even clearer. I’ll add two more class functions in this example, namely __init__ and __call__:

class Function:
    ...

    def __init__(self, fun, *args, **kwargs):
        ...
        self.fun = fun

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return MethodType(self, obj)

    def __call__(self, *args, **kwargs):
        ...
        return self.fun(*args, **kwargs)

Why did I add these functions? Well, now you can easily imagine how the Function object plays its role in this whole scenario of method binding. This new Function object stores the original function as an attribute. This object is also callable, which means that we can invoke it as a function. In that case, it works just like the function it wraps. Remember, everything in Python is an object, even functions. And MethodType “wraps” the Function object together with the reference to the object on which we are calling the method (in our case shout).
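To see the mockup in action, here is a runnable sketch that uses the real types.MethodType and wraps a function by hand. Function is a teaching stand-in, not the built-in function type, and the lambda is only a compact way to supply something to wrap.

```python
from types import MethodType


class Function:
    """Teaching stand-in for the built-in function type (not the real thing)."""

    def __init__(self, fun):
        self.fun = fun

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return MethodType(self, obj)

    def __call__(self, *args, **kwargs):
        return self.fun(*args, **kwargs)


class Person:
    def __init__(self, name):
        self.name = name

    # Wrap a plain function by hand, mirroring what the text describes
    # for every function defined in a class body.
    shout = Function(lambda self: f"Hey! I'm {self.name}.")


p = Person('John')

print(type(p.shout).__name__)       # method
print(p.shout())                    # Hey! I'm John.
print(Person.shout(p))              # Hey! I'm John.
print(type(Person.shout).__name__)  # Function
```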

How does MethodType do that? Well, it keeps these references and implements the callable protocol. Here is the official documentation mockup for the MethodType class:

class MethodType:

    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)

Again, for brevity’s sake, func ends up referencing our initial class function (shout), obj references the instance (p), and then we have the arguments and keyword arguments that are passed along. self in the shout declaration ends up referencing this obj, which is essentially p in our example.
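Both references are visible on a real bound method, so you can inspect them yourself. A short sketch, again with shout returning its text so the comparison is easy. The last two lines reflect CPython behavior, where each access to p.shout builds a new method object.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def shout(self):
        return f"Hey! I'm {self.name}."


p = Person('John')
method = p.shout

print(method.__func__ is Person.shout)  # True: the original class function
print(method.__self__ is p)             # True: the instance it is bound to

# Calling the bound method is equivalent to calling the function with p.
print(method() == method.__func__(method.__self__))  # True

# Every attribute access builds a fresh method object (in CPython),
# yet two such objects compare equal.
print(p.shout is p.shout)  # False
print(p.shout == p.shout)  # True
```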

In the end, it should be clear why we make a distinction between functions and methods and how functions get bound once they are accessed through objects by using the “dot operator”. If you think about it, we would be perfectly okay with invoking class functions in the following way:

class Person:

    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

p = Person('John')

Person.shout(p)
# Hey! I'm John.

Yet, this really is not the advised way, and it is just plain ugly. Normally, you won’t have to do this in your code.

So, before I conclude, I want to go over a couple of examples of attribute resolution just to make this easier to understand. Let’s use the previous example and work out how the dot operator behaves.

p.name
"""
1. __getattribute__ is invoked with p and "name" arguments.

2. objtype is Person.

3. descr_get is null because the Person class doesn't have
   "name" in its dictionary (namespace).

4. Since there is no descr_get at all, we skip the first if block.

5. "name" does exist in the object's dictionary, so we get the value.
"""

p.shout('Hey')
"""
Before we go into name resolution steps, keep in mind that
Person.shout is an instance of the function class. Essentially, it gets
wrapped in it. And this object is callable, so you can invoke it with
Person.shout(...). From a developer's perspective, everything works just
as if it were defined in the class body. But in the background, it
most certainly is not.

1. __getattribute__ is invoked with p and "shout" arguments.

2. objtype is Person.

3. Person.shout is actually wrapped and is a non-data descriptor.
   So this wrapper does have the __get__ method implemented, and it
   gets referenced by descr_get.

4. The wrapper object is a non-data descriptor, so the first if block
   is skipped.

5. "shout" doesn't exist in the object's dictionary because it is part
   of the class definition. The second if block is skipped.

6. "shout" is a non-data descriptor, so its __get__ method is called
   and the result is returned from the third if code block.

Now, here we tried accessing p.shout('Hey'), but what we got first is
the result of the descriptor's __get__ method, which is a MethodType
object. Because of this p.shout(...) works, but what ends up being
called is an instance of the MethodType class. This object is
essentially a wrapper around the `Function` wrapper, and it holds
references to the `Function` wrapper and our object p. In the end, when
you invoke p.shout('Hey'), what ends up being invoked is the `Function`
wrapper with the p object and 'Hey' as positional arguments. (With
shout defined as above, taking only self, the extra 'Hey' then makes
the call itself fail with a TypeError; the resolution steps are the
same either way.)
"""

Person.shout(p)
"""
Before we go into name resolution steps, keep in mind that
Person.shout is an instance of the function class. Essentially, it gets
wrapped in it. And this object is callable, so you can invoke it with
Person.shout(...). From a developer's perspective, everything works just
as if it were defined in the class body. But in the background, it
most certainly is not.

This part is the same. The following steps are different. Check
it out.

1. __getattribute__ is invoked with Person and "shout" arguments.

2. objtype is type. This mechanism is described in my post on
   metaclasses.

3. Following the mockup above, "shout" is looked up on type and its
   parents. It is not defined there, so cls_var and descr_get stay
   null.

4. There is no descriptor to use, so the first if block is skipped.

5. "shout" does exist in the object's dictionary because Person is an
   object after all. So the "shout" function is returned.

When Person.shout is invoked, what actually gets invoked is an instance
of the `Function` class, which is also callable and is a wrapper around
the original function defined in the class body. This way, the original
function gets called with all positional and keyword arguments.
"""

If reading this article in one go was not an easy endeavor, don’t fret! The whole mechanism behind the “dot operator” is not something you grasp that easily. There are at least two reasons, one being how __getattribute__ does the name resolution, and the other being how class functions get wrapped upon class body interpretation. So, make sure you go over the article a couple of times and play with the examples. Experimenting is really what drove me to start a series called Advanced Python.

One more thing! If you like the way I explain things and there is something advanced in the world of Python that you would like to read about, shout out!
