Micro-Benchmarking in Python: Measuring Performance with Custom Counters

Introduction
In software development, performance is a critical factor that can significantly impact user experience and system efficiency. To optimize code and identify performance bottlenecks, developers often use micro-benchmarking techniques to measure the execution time of specific code snippets or functions. In this blog, we will explore a micro-benchmarking library in Python that allows us to measure various aspects of code performance, including CPU Performance Monitoring Unit (PMU) counters, jemalloc memory allocations, and custom measurements like the number of hash collisions.
Overview of the Project
The micro-benchmarking library consists of three main functionalities, each implemented in separate modules:
  1. PMU Counters Benchmarking: This module utilizes the perf library to measure CPU PMU counters. These counters provide valuable insights into the behavior of CPU caches, branch mispredictions, and other low-level performance metrics.
  2. Memory Allocation Benchmarking: In this module, the pyjemalloc library is used to measure memory allocations with the jemalloc memory allocator. Jemalloc is known for its efficiency and scalability, making it a good candidate for memory allocation benchmarking.
  3. Custom Measurement Benchmarking: The custom measurement module allows us to measure custom computations in the code. In this example, we demonstrate how to measure the number of hash collisions using a custom hashing algorithm.
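All three modules share a common benchmarking pattern: call the target function `number` times per run, repeat for several runs, and keep the best result. As a rough, hypothetical illustration of that pattern (this is not the library's actual API; `benchmark` and its parameters are made up for this sketch):

```python
import time

def benchmark(func, runs=5, number=1000):
    """Call func `number` times per run, repeat for `runs` runs,
    and return the best (minimum) wall-clock time of a run in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(number):
            func()
        timings.append(time.perf_counter() - start)
    return min(timings)

def example_function():
    return sum(i * i for i in range(100))

best = benchmark(example_function, runs=3, number=100)
print(f"Best of 3 runs: {best:.6f}s")
```

Reporting the minimum rather than the mean is the usual micro-benchmarking convention (timeit does the same), since the fastest run best reflects the cost of the code itself rather than system noise.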
How to Use the Micro-Benchmarking Library
To use the micro-benchmarking library, follow these steps:
  1. Installation: Clone the repository and install the required dependencies using pip by running the following commands:

     git clone https://github.com/athy125/PyPerfMonitor.git
     cd PyPerfMonitor
     python -m venv venv
     source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
     pip install -r requirements.txt

  2. PMU Counters Benchmarking: In the pmu_benchmark.py module, you can specify the events to measure and the function you want to benchmark. For example:

     from pmu_benchmark import PMUBenchmark

     def example_function():
         for i in range(100000):
             _ = i * i

     if __name__ == "__main__":
         pmu_benchmark = PMUBenchmark(runs=5, number=1000)
         pmu_events = [
             "cpu/cache-misses/",
             "cpu/branch-misses/",
         ]
         pmu_benchmark.measure_pmu_counters(example_function, events=pmu_events)

  3. Memory Allocation Benchmarking: In the memory_benchmark.py module, you can measure memory allocations with jemalloc by specifying the allocation size and the number of runs. For example:

     from memory_benchmark import MemoryBenchmark

     if __name__ == "__main__":
         memory_benchmark = MemoryBenchmark(runs=5, number=1000)
         allocation_size = 1024
         memory_benchmark.measure_memory_allocation(allocation_size)

  4. Custom Measurement Benchmarking: In the custom_benchmark.py module, you can define custom computations. In this example, we measure the number of hash collisions among SHA-256 digests of random strings. For example:

     import hashlib
     import random
     import string

     def custom_hash_collision():
         collisions = 0
         hash_set = set()
         # Generate 10000 random strings and compute their SHA-256 hash values
         for i in range(10000):
             data = ''.join(random.choices(string.ascii_letters + string.digits, k=10)).encode()
             hash_value = hashlib.sha256(data).hexdigest()
             # Count a collision if this hash was already seen
             if hash_value in hash_set:
                 collisions += 1
             hash_set.add(hash_value)
         print(f"Number of hash collisions: {collisions}")
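As a point of comparison for the jemalloc step, allocation behavior can also be observed with nothing but the standard library's tracemalloc module. This is an illustrative alternative, not part of the library; `measure_allocation` is a hypothetical helper:

```python
import tracemalloc

def measure_allocation(size, number=1000):
    """Allocate `number` buffers of `size` bytes each and return the
    peak traced memory (in bytes) while they are all alive."""
    tracemalloc.start()
    buffers = [bytearray(size) for _ in range(number)]
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del buffers
    return peak

peak = measure_allocation(1024, number=100)
print(f"Peak traced memory: {peak} bytes")
```

Unlike jemalloc-level counters, tracemalloc only sees allocations made through Python's memory allocator, but it requires no third-party dependencies.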
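One caveat about the hash-collision example: with full SHA-256 digests, a collision among only 10,000 inputs is astronomically unlikely, so the printed count will be zero in practice. To actually observe collisions, one can truncate the digest to a few hex characters, where the birthday bound makes collisions common. A sketch under that assumption (`truncated_hash_collisions` is illustrative, not part of the library):

```python
import hashlib
import random
import string

def truncated_hash_collisions(n=10000, hex_chars=4):
    """Count collisions among n random strings when SHA-256 digests are
    truncated to `hex_chars` hex characters (4 chars = a 16-bit space,
    so collisions among 10,000 inputs are expected)."""
    random.seed(42)  # fixed seed so the count is reproducible
    seen = set()
    collisions = 0
    for _ in range(n):
        data = ''.join(random.choices(string.ascii_letters + string.digits, k=10)).encode()
        digest = hashlib.sha256(data).hexdigest()[:hex_chars]
        if digest in seen:
            collisions += 1
        seen.add(digest)
    return collisions

print(f"Truncated-hash collisions: {truncated_hash_collisions()}")
```

Shrinking `hex_chars` back toward the full 64-character digest drives the count back to zero, which is a quick way to see the birthday bound at work.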
Conclusion
Micro-benchmarking is a valuable technique for analyzing and optimizing code performance. The micro-benchmarking library we explored in this blog allows us to measure various aspects of code performance, including PMU counters, memory allocations, and custom computations. By using this library, developers can gain deeper insights into their code's performance characteristics and identify areas for improvement.
If you're interested in diving deeper into performance optimization, consider exploring more complex benchmarking scenarios and utilizing specialized tools for profiling and analyzing code execution. Happy benchmarking!
 

Atharva Joshi

Wed Aug 02 2023