
Understanding Underscores in Python

Ever wondered why some attribute and method names in Python start with a single underscore, start with a double underscore (“dunder”), or end with a single underscore?

Single and double underscores have a meaning in Python variable and method names. Some of that meaning is merely by convention and intended as a hint to the programmer, and some of it is enforced by the Python interpreter.

Single Leading Underscore: _var in a Class

When it comes to variable and method names, the single underscore prefix has a meaning by convention only. It’s a hint to the programmer that a variable or method starting with a single underscore is intended for internal use. This isn’t enforced by Python. Python does not have strong distinctions between “private” and “public” variables like Java does. A single leading underscore indicates that a name isn’t really meant to be part of the public interface of the class, so it should be left alone.

# Create a class Test with two attributes
class Test:
    def __init__(self):
        self.color = "red"
        self._num = 9 # internal attribute
        
# object of class Test
t = Test() 
print(t.color)
print(t._num)

Output:
-------
red
9

Because the single leading underscore is only a convention, anyone who wants to can still access the attribute. The Python interpreter does not prevent us from “reaching into” the class and reading the value of that variable. We can even set the value of the internal attribute _num.

t.color = "blue"
t._num = 11

print(t.color, t._num)

Output:
-------
blue 11

Leading underscores do impact how names get imported from modules

If we use a wildcard import to import all names from a module, Python will not import names with a leading underscore (unless the module defines an __all__ list that overrides this behavior). For example, from M import * does not import objects whose names start with an underscore.
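The wildcard rule can be checked without creating a file on disk. The following is a minimal sketch that builds a throwaway in-memory module; the names demo_mod, public_func and _internal_func are hypothetical and exist only for this illustration.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_mod").
demo_mod = types.ModuleType("demo_mod")
exec(
    "def public_func():\n"
    "    return 'public'\n"
    "def _internal_func():\n"
    "    return 'internal'\n",
    demo_mod.__dict__,
)
sys.modules["demo_mod"] = demo_mod  # make it importable by name

# Wildcard import without __all__: underscore-prefixed names are skipped.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(n for n in namespace if not n.startswith("__"))
print(imported)  # ['public_func']

# Defining __all__ overrides the default rule.
demo_mod.__all__ = ["_internal_func"]
namespace_with_all = {}
exec("from demo_mod import *", namespace_with_all)
imported_with_all = sorted(n for n in namespace_with_all if not n.startswith("__"))
print(imported_with_all)  # ['_internal_func']
```

In real code the module would normally be a .py file, and an explicit import of the names you need is usually clearer than a wildcard import.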

So, single leading underscores are a Python naming convention indicating that a name is meant for internal use. The convention is generally not enforced by the Python interpreter and is meant as a hint to the programmer only.

Single Underscore in the Interpreter

In interactive mode, Python automatically stores the value of the last evaluated expression in a special variable named “_”. We can also assign its value to another variable if we want to.

>>> 8 + 8
16
>>> _
16
>>> 9 + 10
19
>>> _
19
>>> 20 * 3 + 10
70
>>> _
70
>>> result=_
>>> result
70

Underscore Can Be Used to Ignore Values

Ignoring a value means assigning it to the underscore variable (_). By convention, we assign values to _ to signal that they will not be used later in the code.

# Let's say we want to assign the first and last values of a tuple to variables

a, *_, b = (11, 9, 3, 5, 7, 8)

print(a)  # first value of the tuple
print(b)  # last value of the tuple
print(_)  # middle values of the tuple, which are to be ignored

Output:
------
11
8
[9, 3, 5, 7]
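The same idea works without the star when only one position should be skipped. A small sketch, assuming a hypothetical (name, city, age) record layout that is not part of the original example:

```python
# Hypothetical record layout: (name, city, age); the city is not needed here.
record = ("Ram", "Pune", 10)
name, _, age = record
print(name, age)  # Ram 10

# The same pattern inside a comprehension, skipping the middle field each time.
records = [("Ram", "Pune", 10), ("Sita", "Delhi", 12)]
ages = {person: years for person, _, years in records}
print(ages)  # {'Ram': 10, 'Sita': 12}
```

Note that _ is still an ordinary variable here; using it is only a signal to the reader that the value is intentionally unused.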

Underscore Used in Looping

We can use the underscore (_) as the loop variable. It is an ordinary name, so its value can still be read inside the loop.

for _ in range(5):
    print(_)

Output:
-------
0
1
2
3
4
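In practice, _ is most often chosen as the loop variable when the counter is never read and the loop only needs to repeat an action a fixed number of times. A minimal sketch of that usage (the list name stars is illustrative):

```python
stars = []
for _ in range(3):     # the counter itself is never read
    stars.append("*")

print("".join(stars))  # ***
```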

Separating Digits Of Numbers

If we have a number with many digits, we can separate groups of digits with underscores to make it easier to read (supported since Python 3.6).

million = 1_000_000
billion = 1_000_000_000

print(million)
print(billion)

Output:
--------
1000000
1000000000
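Underscore separators are not limited to decimal literals. The sketch below, with illustrative variable names, shows them in hexadecimal and binary literals, in strings passed to int(), and as a grouping option when formatting numbers:

```python
hex_mask = 0xFF_FF            # hexadecimal literal with a separator
binary_flags = 0b1010_0101    # binary literal with a separator
parsed = int("1_000_000")     # int() also accepts underscores in strings
formatted = f"{1234567:_}"    # "_" as a thousands separator when formatting

print(hex_mask)      # 65535
print(binary_flags)  # 165
print(parsed)        # 1000000
print(formatted)     # 1_234_567
```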

Single Trailing Underscore: var_

Sometimes the most fitting name for a variable is already taken by a keyword, so names like class or def cannot be used as variable names in Python. In this case you can append a single underscore to break the naming conflict. So single_trailing_underscore_ is used by convention to avoid conflicts with Python keywords.

class Student:
    def __init__(self, name, class_):
        self.name = name
        self.class_ = class_
        
s = Student('Ram', 10)
print(f"{s.name} studies in class {s.class_}")

Output:
--------
Ram studies in class 10
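Whether a name collides with a keyword can be checked with the standard library's keyword module. The helper safe_name below is a hypothetical illustration of the trailing-underscore convention, not an established API:

```python
import keyword

def safe_name(name):
    """Append an underscore if `name` is a reserved Python keyword."""
    return name + "_" if keyword.iskeyword(name) else name

print(safe_name("class"))  # class_
print(safe_name("def"))    # def_
print(safe_name("name"))   # name
```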

Double Leading Underscore: __var

The naming patterns we covered so far received their meaning from agreed upon conventions only. With Python class attributes (variables and methods) that start with double underscores, things are a little different.

A double underscore prefix causes the Python interpreter to rewrite the attribute name in order to avoid naming conflicts in subclasses.

This is also called name mangling: the interpreter changes the name of the variable in a way that makes it harder to create collisions when the class is extended later.

class Sample:
    def __init__(self):
        self.a = 1
        self._b = 2
        self.__c = 3
        

s = Sample()
print(s.a)
print(s._b)

Output:
-----
1
2

# If we try to access s.__c, we get an error
print(s.__c)

Error: AttributeError: 'Sample' object has no attribute '__c'

Let's examine the attributes of object s:

dir(s)[0]

Output:
------
'_Sample__c'

Notice that the attribute __c has been renamed to _Sample__c.
This is the name mangling that the Python interpreter applies. It does this to protect the variable from getting overridden in subclasses.

# So we can access it as
print(s._Sample__c)

Output:
------
3

__double_leading_underscore: when naming a class attribute, invokes name mangling

# Let's create a child class of Sample

class Sample_extended(Sample):
    def __init__(self):
        super().__init__()
        self.a = "overridden"
        self._b = "overridden"
        self.__c = "overridden"
        
        
se = Sample_extended()
print(se.a)
print(se._b)

Output:
-------
overridden
overridden

# Let's check the attributes of object se
dir(se)[0:2]

Output:
------
['_Sample__c', '_Sample_extended__c']

Here name mangling works again.
Notice that two attributes are created: one belongs to the base (parent) class, i.e. _Sample__c,
and the other belongs to the child class, i.e. _Sample_extended__c.

print(se._Sample__c)
print(se._Sample_extended__c)

Output:
------
3
overridden

So, attributes a and _b get overridden, whereas for __c a separate attribute is created for each class.
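The practical benefit of mangling shows up when methods in the parent class rely on their own double-underscore attribute. In the following sketch (the class names Base and Child and the attribute __token are hypothetical), each class keeps reading its own value even though both use the same source name:

```python
class Base:
    def __init__(self):
        self.__token = "base"      # stored as _Base__token

    def base_token(self):
        return self.__token        # compiled as self._Base__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token, no clash

    def child_token(self):
        return self.__token        # compiled as self._Child__token


c = Child()
print(c.base_token())    # base
print(c.child_token())   # child
print(sorted(vars(c)))   # ['_Base__token', '_Child__token']
```

Had both classes used _token with a single underscore, the assignment in Child would have replaced the value that base_token() depends on.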

What’s a “dunder” in Python?

Double underscores are often referred to as “dunders” in the Python community. The reason is that double underscores appear quite often in Python code and to avoid fatiguing their jaw muscles Pythonistas often shorten “double underscore” to “dunder.”

For example, you’d pronounce __baz as “dunder baz”. Likewise __init__ would be pronounced as “dunder init”, even though one might think it should be “dunder init dunder.” But that’s just yet another quirk in the naming convention.

Double Leading and Trailing Underscore: __var__

Perhaps surprisingly, name mangling is not applied if a name starts and ends with double underscores. Variables surrounded by a double underscore prefix and postfix are left unscathed by the Python interpreter.

class PrePostUnderscoreTest:
    def __init__(self):
        self.__shape__ = "circle"

PrePostUnderscoreTest().__shape__

Output:
-------
'circle'

However, names that have both leading and trailing double underscores are reserved for special use in the language.
Methods such as __init__(), which have leading and trailing dunders (i.e. double underscores), are called magic methods or dunder methods.

It’s best to stay away from using names that start and end with double underscores (“dunders”) in your own programs to avoid collisions with future changes to the Python language.
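Implementing the dunder methods that the language already defines is, by contrast, exactly what they are for. A minimal sketch, using a hypothetical Playlist class, of how __len__ and __repr__ hook into the built-in len() and repr():

```python
class Playlist:
    def __init__(self, songs):
        self._songs = list(songs)   # internal attribute by convention

    def __len__(self):
        # Called by the built-in len()
        return len(self._songs)

    def __repr__(self):
        # Called by repr() and by the interactive interpreter
        return f"Playlist({self._songs!r})"


p = Playlist(["a", "b"])
print(len(p))    # 2
print(repr(p))   # Playlist(['a', 'b'])
```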

Python Underscore Naming Patterns – Summary

Single Leading Underscore i.e _var : Naming convention indicating a name is meant for internal use. Generally not enforced by the Python interpreter (except in wildcard imports) and meant as a hint to the programmer only.

Single Trailing Underscore i.e var_ : Used by convention to avoid naming conflicts with Python keywords.

Double Leading Underscore i.e __var : Triggers name mangling when used in a class context. Enforced by the Python interpreter.

Double Leading and Trailing Underscore i.e __var__ : Indicates special methods defined by the Python language. Avoid this naming scheme for your own attributes.

Shekhar Pandey
