Python: Difference between revisions

From Elvanör's Technical Wiki


Revision as of 09:57, 21 August 2023

Important changes from Java

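  • When you pass an object to a function and "reassign" it through the function's parameter name, the caller's variable is not reassigned. This is because Python rebinds the local name only. Changes made to the object in place (for instance appending to a list) are still visible to the caller. Java object references behave the same way, so this mostly matters if you expect pass-by-reference semantics.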
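A minimal sketch of the difference between rebinding the parameter name and changing the object in place. The function and variable names are made up for illustration.

```python
def rebind(items):
    # Rebinds the local name only; the caller's list is untouched.
    items = [99]


def mutate(items):
    # Changes the object itself; the caller sees this.
    items.append(99)


numbers = [1, 2, 3]

rebind(numbers)
after_rebind = list(numbers)   # copy of the state after rebind()

mutate(numbers)

print(after_rebind)  # [1, 2, 3]
print(numbers)       # [1, 2, 3, 99]
```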

Useful techniques

  • To iterate over two lists at the same time, use the zip built-in function:
for a, b in zip(list1, list2):
    print(a, b)
  • To use the condition ? a : b construct:
a if condition else b
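The two techniques above can be combined in a short sketch. The list contents and the 50-point threshold are invented for the example. Note that zip stops at the end of the shortest list.

```python
names = ["alice", "bob", "carol"]
scores = [10, 55, 80]

labels = []
for name, score in zip(names, scores):
    # Conditional expression: value_if_true if condition else value_if_false
    status = "pass" if score >= 50 else "fail"
    labels.append(f"{name}: {status}")

print(labels)
```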

Casts

  • To cast a float to an integer, you can use the built-in int() function. It truncates toward zero; it does not round.
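A quick illustration of how int() compares with the other standard conversions (math.floor and round are shown only for contrast):

```python
import math

truncated_pos = int(3.9)       # drops the fractional part
truncated_neg = int(-3.9)      # truncates toward zero, not downward
floored = math.floor(-3.9)     # always rounds down
rounded = round(3.9)           # rounds to the nearest integer

print(truncated_pos, truncated_neg, floored, rounded)  # 3 -3 -4 4
```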

Exceptions

  • Sample code to deal with an exception:
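	import sys

	try:
	    doSomething()
	except Exception as inst:
	    # Arguments passed to the exception constructor
	    print(inst.args)
	    # Class of the exception currently being handled
	    print(sys.exc_info()[0])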

This will give you information about the raised exception type.
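A self-contained version of the same pattern, written for Python 3. Here do_something is a hypothetical function that always fails, so that the handler has something to report.

```python
import sys


def do_something():
    # Hypothetical failing operation, for illustration only.
    raise ValueError("bad input", 42)


try:
    do_something()
except Exception as inst:
    exc_args = inst.args            # tuple of constructor arguments
    exc_type = sys.exc_info()[0]    # class of the exception being handled
    print(exc_args)
    print(exc_type)
```

Inside the handler, type(inst) gives the same class as sys.exc_info()[0].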

Scopes

Within a module, inside a function, you can read the module variables normally. However, a plain assignment does not modify them: it creates a local variable with the same name instead. To assign to a module variable inside a function, you need to declare the variable as global with the global keyword before assigning it.

global myVariable

This is quite strange!
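A small sketch of the behaviour, with made-up names: the first function silently creates a local variable, the second one really updates the module variable.

```python
counter = 0


def set_local():
    # Plain assignment: creates a local variable named counter.
    counter = 1
    return counter


def increment_global():
    # The global declaration makes the assignment target the module variable.
    global counter
    counter += 1


set_local()
after_local = counter      # still 0, the module variable was not touched

increment_global()
print(after_local, counter)  # 0 1
```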

Classes

  • You don't need to explicitly declare class fields, as they are dynamically created the first time you assign them.
  • Every instance method should take "self" as its first argument.
  • Accessing an object attribute that does not exist raises an AttributeError. Code like:
if object.myAttribute:

will not work as expected if myAttribute was never assigned. One easy solution is to define the attribute in the class initializer method (__init__) and set it to None.
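A minimal sketch of the problem and of the __init__ solution. The class names are hypothetical. The getattr call with a default is another option, shown as an alternative rather than as the approach described above.

```python
class WithInit:
    def __init__(self):
        # Declared up front, so the attribute always exists.
        self.my_attribute = None


class Bare:
    pass


# Safe: the attribute exists and is falsy.
has_value = bool(WithInit().my_attribute)

# Unsafe: the attribute was never assigned.
try:
    Bare().my_attribute
    raised = False
except AttributeError:
    raised = True

# Alternative: getattr with a default avoids the exception.
fallback = getattr(Bare(), "my_attribute", None)

print(has_value, raised, fallback)  # False True None
```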

Profiling & Memory

Sizes

  • On 64-bit CPython, a float object uses 24 bytes, and you need an additional 8 bytes to hold the pointer to the object in a list, for instance (there are no primitive types in Python, everything is an object). So that's 32 bytes per float in a list, against 8 bytes for the raw double value.
  • If you want to optimize RAM, you should consider using numpy or other libraries to hold large amounts of floats.
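A rough way to check these numbers yourself. As an assumption, this sketch uses the standard library array module as a stand-in for numpy, because it needs no installation; a numpy float64 array stores values in the same packed 8-byte form. Exact sizes depend on the interpreter and platform.

```python
import sys
from array import array

n = 100_000
values = [float(i) for i in range(n)]

# List: one pointer per element, plus one separate float object per value.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# array("d"): raw C doubles stored contiguously.
packed = array("d", values)
packed_bytes = sys.getsizeof(packed)

print(f"list:  {list_bytes / n:.1f} bytes per float")
print(f"array: {packed_bytes / n:.1f} bytes per float")
```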

Basic profiling techniques

  • Be careful with tracemalloc: tracing must be started explicitly with tracemalloc.start(), and it adds a large memory overhead of its own (so, for instance, RSS results measured while tracing are no longer significant). Never run tracemalloc in production.
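  • A simple technique to measure memory usage is the following code (psutil is a third-party package). It gives you the resident set size (RSS) of your process, converted here to MiB.
import psutil
# Memory statistics of the current process
memoryInfo = psutil.Process().memory_info()
# rss is reported in bytes; convert to MiB
rss = memoryInfo.rss / 1024 ** 2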
  • Note that when using CPython, the interpreter does not always release the memory it allocated back to the OS. It allocates memory in chunks (called arenas), and an arena can only be released once it is entirely free: if any part of it is still in use, the whole arena stays allocated. However, arenas are small (256 KiB, or 1 MiB on 64-bit builds since Python 3.10) and freed space in an arena is reused by CPython, so that's usually not an issue.
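For debugging sessions (not production, as noted above), a minimal tracemalloc sketch could look like this. The list allocation is only there to give the tracer something to measure.

```python
import tracemalloc

# Tracing must be started explicitly before the allocations of interest.
tracemalloc.start()

data = [float(i) for i in range(100_000)]

# Bytes currently traced, and the peak since start().
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()

# Stop tracing as soon as possible to remove the overhead.
tracemalloc.stop()

print(f"current: {current / 1024 ** 2:.2f} MiB, peak: {peak / 1024 ** 2:.2f} MiB")
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```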

cProfile

  • This profiler is built into Python. It produces profiling data about the CPU time spent in each function or method, and can thus be useful to locate performance bottlenecks.
  • You can use snakeviz as a GUI to better analyze the results produced by cProfile.
  • To start profiling, use:
import cProfile
profiler = cProfile.Profile()
profiler.enable()
  • Note that any method printing or exporting profiling data (like dump_stats() or print_stats()) will "pause" the profiler, because the profiler is disabled before the statistics are collected. You will need to call profiler.enable() again manually, and then data collection will resume (aggregating into the same profiler object; it does not reset statistics). This is a bit strange and not documented.
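A minimal sketch of a full profiling session, including the re-enable step described above. busy_work is a hypothetical workload, and the report is written to a string buffer only to keep the example self-contained.

```python
import cProfile
import io
import pstats


def busy_work(n):
    # Hypothetical CPU-bound workload.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
busy_work(50_000)

# Creating the Stats object collects the data, which pauses the profiler.
first = io.StringIO()
pstats.Stats(profiler, stream=first).sort_stats("cumulative").print_stats(5)

# Resume collection; new data is added to the same profiler object.
profiler.enable()
busy_work(50_000)
profiler.disable()

second = io.StringIO()
pstats.Stats(profiler, stream=second).sort_stats("cumulative").print_stats(5)
report = second.getvalue()
print(report)
```

To inspect the results in snakeviz, write them to a file with profiler.dump_stats("some_file.prof") (the file name is arbitrary) and open that file with the tool.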