This article covers the various meanings and naming conventions around single and double underscores ("dunder") in Python, how name mangling works, and how it affects your own Python classes.

Single and double underscores have a meaning in Python variable and method names. Some of that meaning is merely by convention and intended as a hint to the programmer, and some of it is enforced by the Python interpreter.

If you're wondering "What's the meaning of single and double underscores in Python variable and method names?" I'll do my best to get you the answer here. In this article I'll discuss the following five underscore patterns and naming conventions and how they affect the behavior of your Python programs: the single leading underscore (_var), the single trailing underscore (var_), the double leading underscore (__var), the double leading and trailing underscore (__var__), and the single underscore (_).
At the end of the article you'll also find a brief "cheat sheet" summary of the five different underscore naming conventions and their meaning, as well as a short video tutorial that gives you a hands-on demo of their behavior. Let's dive right in!

When it comes to variable and method names, the single underscore prefix has a meaning by convention only. It's a hint to the programmer: it means what the Python community agrees it should mean, but it does not affect the behavior of your programs.

The underscore prefix tells other programmers that a variable or method starting with a single underscore is intended for internal use. This convention is defined in PEP 8.

This isn't enforced by Python. Python does not have strong distinctions between "private" and "public" variables like Java does. It's like someone put up a tiny underscore warning sign that marks the name as internal and asks you to leave it alone.
Take a look at the following example:

    class Test:
        def __init__(self):
            self.foo = 11
            self._bar = 23

What's going to happen if you instantiate this class and try to access the foo and _bar attributes defined in its __init__ constructor? Let's find out:

    >>> t = Test()
    >>> t.foo
    11
    >>> t._bar
    23

You just saw that the leading single underscore in _bar did not prevent us from "reaching into" the class and accessing the value of that variable.

That's because the single underscore prefix in Python is merely an agreed-upon convention, at least when it comes to variable and method names.

However, leading underscores do impact how names get imported from modules. Imagine you had the following code in a module called my_module:

    # This is my_module.py:

    def external_func():
        return 23

    def _internal_func():
        return 42

Now if you use a wildcard import to import all names from the module, Python will not import names with a leading underscore (unless the module defines an __all__ list that overrides this behavior):

    >>> from my_module import *
    >>> external_func()
    23
    >>> _internal_func()
    NameError: name '_internal_func' is not defined

By the way, wildcard imports should be avoided as they make it unclear which names are present in the namespace. It's better to stick to regular imports for the sake of clarity.

Unlike wildcard imports, regular imports are not affected by the leading single underscore naming convention:

    >>> import my_module
    >>> my_module.external_func()
    23
    >>> my_module._internal_func()
    42

I know this might be a little confusing at this point. If you stick to the PEP 8 recommendation that wildcard imports should be avoided, then really all you need to remember is this: a single leading underscore is a naming convention indicating that a name is meant for internal use. It is generally not enforced by the Python interpreter and serves as a hint to the programmer only.
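The wildcard-import behavior, including the __all__ override mentioned above, can be reproduced in a single file. This is only a sketch: it builds my_module in memory instead of on disk, and the make_module helper is a made-up name used for illustration, not part of any library.

```python
import sys
import types


def make_module(name, source):
    """Hypothetical helper: create a module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module  # lets a normal import statement find it
    return module


SOURCE = '''
def external_func():
    return 23

def _internal_func():
    return 42
'''

# Without __all__, the wildcard import skips names with a leading underscore.
make_module("my_module", SOURCE)
namespace = {}
exec("from my_module import *", namespace)
print("external_func" in namespace)   # True
print("_internal_func" in namespace)  # False

# With __all__, the wildcard import copies exactly the names listed there.
make_module("my_module", SOURCE + "__all__ = ['_internal_func']\n")
namespace = {}
exec("from my_module import *", namespace)
print("external_func" in namespace)   # False
print("_internal_func" in namespace)  # True
```

Note that __all__ replaces the default rule rather than adding to it, which is why external_func disappears in the second run.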
Sometimes the most fitting name for a variable is already taken by a keyword. Names like class or def cannot be used as variable names in Python. In this case you can append a single underscore to break the naming conflict:

    >>> def make_object(name, class):
    SyntaxError: invalid syntax

    >>> def make_object(name, class_):
    ...     pass

In summary, a single trailing underscore (postfix) is used by convention to avoid naming conflicts with Python keywords. This convention is explained in PEP 8.

The naming patterns we covered so far received their meaning from agreed-upon conventions only. With Python class attributes (variables and methods) that start with double underscores, things are a little different.

A double underscore prefix causes the Python interpreter to rewrite the attribute name in order to avoid naming conflicts in subclasses.

This is also called name mangling: the interpreter changes the name of the variable in a way that makes it harder to create collisions when the class is extended later.

I know this sounds rather abstract. This is why I put together this little code example we can use for experimentation:

    class Test:
        def __init__(self):
            self.foo = 11
            self._bar = 23
            self.__baz = 23

Let's take a look at the attributes on this object using the built-in dir() function:

    >>> t = Test()
    >>> dir(t)
    ['_Test__baz', '__class__', '__delattr__', '__dict__', '__dir__',
     '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
     '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__',
     '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
     '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
     '__weakref__', '_bar', 'foo']

This gives us a list with the object's attributes. (The exact list depends on your Python version; newer releases include a few additional dunder attributes.) Let's take this list and look for our original variable names foo, _bar, and __baz. I promise you'll notice some interesting changes.
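Rather than scanning the list by eye, a short check can do the lookup. This is a minimal sketch that reuses the Test class from above; the names and results variables are introduced here only for illustration.

```python
class Test:
    def __init__(self):
        self.foo = 11
        self._bar = 23
        self.__baz = 23  # written inside the class body, so it gets mangled


t = Test()
names = dir(t)

# These strings are plain data outside the class, so they are not mangled.
results = {name: name in names for name in ("foo", "_bar", "__baz", "_Test__baz")}
for name, present in results.items():
    print(f"{name!r}: {present}")
# 'foo': True
# '_bar': True
# '__baz': False
# '_Test__baz': True
```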
So what happened to __baz? If you look closely you'll see there's an attribute called _Test__baz on this object. This is the name mangling that the Python interpreter applies. It does this to protect the variable from getting overridden in subclasses.

Let's create another class that extends the Test class and attempts to override its existing attributes added in the constructor:

    class ExtendedTest(Test):
        def __init__(self):
            super().__init__()
            self.foo = 'overridden'
            self._bar = 'overridden'
            self.__baz = 'overridden'

Now what do you think the values of foo, _bar, and __baz will be on instances of this ExtendedTest class? Let's take a look:

    >>> t2 = ExtendedTest()
    >>> t2.foo
    'overridden'
    >>> t2._bar
    'overridden'
    >>> t2.__baz
    AttributeError: 'ExtendedTest' object has no attribute '__baz'

Wait, why did we get that AttributeError when we tried to inspect the value of t2.__baz? Name mangling strikes again! It turns out this object doesn't even have a __baz attribute:

    >>> dir(t2)
    ['_ExtendedTest__baz', '_Test__baz', '__class__', '__delattr__',
     '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
     '__getattribute__', '__gt__', '__hash__', '__init__', '__le__',
     '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
     '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
     '__subclasshook__', '__weakref__', '_bar', 'foo']

As you can see, __baz got turned into _ExtendedTest__baz to prevent accidental modification:

    >>> t2._ExtendedTest__baz
    'overridden'

But the original _Test__baz is also still around:

    >>> t2._Test__baz
    23

Double underscore name mangling is fully transparent to the programmer. Take a look at the following example that will confirm this:

    class ManglingTest:
        def __init__(self):
            self.__mangled = 'hello'

        def get_mangled(self):
            return self.__mangled

    >>> ManglingTest().get_mangled()
    'hello'
    >>> ManglingTest().__mangled
    AttributeError: 'ManglingTest' object has no attribute '__mangled'

Does name mangling also apply to method names?
It sure does. Name mangling affects all names that start with two underscore characters ("dunders") in a class context:

    class MangledMethod:
        def __method(self):
            return 42

        def call_it(self):
            return self.__method()

    >>> MangledMethod().__method()
    AttributeError: 'MangledMethod' object has no attribute '__method'
    >>> MangledMethod().call_it()
    42

Here's another, perhaps surprising, example of name mangling in action:

    _MangledGlobal__mangled = 23

    class MangledGlobal:
        def test(self):
            return __mangled

    >>> MangledGlobal().test()
    23

In this example I declared a global variable called _MangledGlobal__mangled. Then I accessed the variable inside the context of a class named MangledGlobal. Because of name mangling I was able to reference the _MangledGlobal__mangled global variable as just __mangled inside the test() method on the class.

The Python interpreter automatically expanded the name __mangled to _MangledGlobal__mangled because it begins with two underscore characters. This demonstrates that name mangling isn't tied to class attributes specifically. It applies to any name starting with two underscore characters used in a class context.

Now this was a lot of stuff to absorb.

To be honest with you, I didn't write these examples and explanations down off the top of my head. It took me some research and editing to do it. I've been using Python for years, but rules and special cases like that aren't constantly on my mind.

Sometimes the most important skills for a programmer are "pattern recognition" and knowing where to look things up. If you feel a little overwhelmed at this point, don't worry. Take your time and play with some of the examples in this article.

Let these concepts sink in enough that you'll recognize the general idea of name mangling and some of the other behaviors I showed you. If you encounter them "in the wild" one day, you'll know what to look for in the documentation.
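As a memory aid, the rewriting rule can be approximated in a few lines. This is a sketch, not the interpreter's actual implementation: the mangle function and the _Account class are hypothetical names made up for this illustration, and the real compiler handles a few more edge cases.

```python
def mangle(class_name: str, name: str) -> str:
    """Approximate the name mangling applied inside a class body."""
    # Only names with two leading underscores that do not also end in
    # two underscores are rewritten.
    if not name.startswith("__") or name.endswith("__"):
        return name
    # Leading underscores are stripped from the class name first; a class
    # name consisting only of underscores leaves the name untouched.
    stripped = class_name.lstrip("_")
    if not stripped:
        return name
    return "_" + stripped + name


class _Account:
    def __init__(self):
        self.__balance = 100          # mangled by the interpreter
        self.__audit__ = "unchanged"  # leading and trailing dunder: left alone


acct = _Account()
print(sorted(vars(acct)))             # ['_Account__balance', '__audit__']
print(mangle("_Account", "__balance") in vars(acct))  # True
print(mangle("Test", "__baz"))        # _Test__baz
```

Comparing the helper's output against vars() on a real instance is a quick way to confirm that a guess about a mangled name matches what the interpreter stored.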
Perhaps surprisingly, name mangling is not applied if a name starts and ends with double underscores. Variables surrounded by a double underscore prefix and postfix are left unscathed by the Python interpreter:

    class PrefixPostfixTest:
        def __init__(self):
            self.__bam__ = 42

    >>> PrefixPostfixTest().__bam__
    42

However, names that have both leading and trailing double underscores are reserved for special use in the language. This rule covers things like __init__ for object constructors, or __call__ to make an object callable.

These dunder methods are often referred to as magic methods, but many people in the Python community, including myself, don't like that term.

It's best to stay away from using names that start and end with double underscores ("dunders") in your own programs to avoid collisions with future changes to the Python language.

Per convention, a single standalone underscore is sometimes used as a name to indicate that a variable is temporary or insignificant.

For example, in the following loop we don't need access to the running index and we can use "_" to indicate that it is just a temporary value:

    >>> for _ in range(32):
    ...     print('Hello, World.')

You can also use single underscores in unpacking expressions as a "don't care" variable to ignore particular values. Again, this meaning is "per convention" only and there's no special behavior triggered in the Python interpreter. The single underscore is simply a valid variable name that's sometimes used for this purpose.

In the following code example I'm unpacking a car tuple into separate variables but I'm only interested in the values for color and mileage. However, in order for the unpacking expression to succeed I need to assign all values contained in the tuple to variables.
That's where "_" is useful as a placeholder variable:

    >>> car = ('red', 'auto', 12, 3812.4)
    >>> color, _, _, mileage = car

    >>> color
    'red'
    >>> mileage
    3812.4
    >>> _
    12

Besides its use as a temporary variable, "_" is a special variable in most Python REPLs that represents the result of the last expression evaluated by the interpreter. In the standard CPython REPL this only works as long as you have not assigned to "_" yourself in the same session, because your own variable would shadow it, so the following examples assume a fresh session.

This is handy if you're working in an interpreter session and you'd like to access the result of a previous calculation. Or if you're constructing objects on the fly and want to interact with them without assigning them a name first:

    >>> 20 + 3
    23
    >>> _
    23
    >>> print(_)
    23

    >>> list()
    []
    >>> _.append(1)
    >>> _.append(2)
    >>> _.append(3)
    >>> _
    [1, 2, 3]

Here's a quick summary or "cheat sheet" of what the five underscore patterns I covered in this article mean in Python:
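As a runnable companion to the summary, the five patterns can be exercised side by side in one short script. This is a sketch only: the Engine class and its attribute names are invented for illustration and do not come from any library.

```python
import keyword


class Engine:
    def __init__(self, class_):
        self.class_ = class_      # var_: trailing underscore avoids the keyword `class`
        self._temperature = 90    # _var: internal by convention, not enforced
        self.__serial = "X1"      # __var: stored as _Engine__serial (name mangling)

    def __len__(self):            # __var__: special method, called by len()
        return 4


engine = Engine("diesel")

print(keyword.iskeyword("class"))   # True, hence the name class_
print(engine._temperature)          # 90, still reachable from outside
print(engine._Engine__serial)       # X1, reachable only under the mangled name
print(hasattr(engine, "__serial"))  # False, no attribute under the original name
print(len(engine))                  # 4

# _: throwaway name for values we do not care about
color, _, _, mileage = ("red", "auto", 12, 3812.4)
print(color, mileage)               # red 3812.4
```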