# Software Development Strategies
## Rubber Ducking
The “rubber ducking” technique comes from a simple idea: explain your code, line by line, to an inanimate object, traditionally a rubber duck on your desk. Verbalizing your logic forces you to slow down and articulate your reasoning. Without any feedback from the duck, this process alone often reveals logical inconsistencies or obvious errors you missed while reading silently.
While commonly associated with debugging, rubber ducking is equally useful before any code is written:
- **Planning:** explain the problem and your intended solution to clarify requirements.
- **Designing:** narrate your architectural decisions to spot structural flaws early.
- **Pseudocode:** verbalize your algorithm while drafting it to ensure it flows naturally.
- **Testing:** explain the reasoning behind each test case to ensure edge cases are covered.
- **Variation:** write the pseudocode and logic out by hand, with pen and paper.
### Rubber ducking and LLMs
Rubber ducking is also the secret to using Large Language Models effectively. Treat the prompt as your duck: by typing out a detailed, step-by-step explanation of your problem, you often solve it yourself. At the very least, you give the LLM exactly the context it needs for a useful answer.
### A practical workflow
There is no single correct way to rubber duck, but the following steps offer a structured approach:
1. **Define the problem:** Tell the duck what you are solving, including context and constraints. Write this definition into the docstring of your script.

   ```python
   """Exploratory script to get the first view of the fight data.

   This script loads the fight data from a CSV file into a pandas
   DataFrame, finds all the unique names, and compiles the set of
   characters used in the fighter names.
   """
   ```

2. **Outline your approach:** Talk through the steps. Write them as comments, like a "recipe" for your code.

   ```python
   # Import needed packages
   # Set the path to the fight_data.csv
   # Read the CSV with pandas
   # Get the column names and print them
   # Get all unique fighter names and display the count
   ```

3. **Write pseudocode:** For each step, explain the specific logic. Expand your comments into detailed pseudocode.
4. **Implement:** Translate pseudocode into real code. Adjust the plan as needed. The goal is solving the problem, not following your initial sketch blindly.
5. **Review:** Walk through the finished script with the duck. Clean up comments: remove what the code already makes obvious, keep what explains *why*.
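Following the recipe above through to the implementation step, a finished exploratory script might look like the sketch below. The fighter data and the `fighter_name` column are assumptions for illustration; a real script would read `fight_data.csv` from disk rather than from an in-memory string.

```python
# Exploratory sketch following the recipe comments above. The data and
# the "fighter_name" column name are illustrative assumptions; a real
# script would read fight_data.csv from disk instead.
import io

import pandas as pd

# Stand-in for the real CSV file on disk
csv_text = io.StringIO(
    "fight_id,fighter_name\n"
    "1,Ryu\n"
    "2,Chun-Li\n"
    "3,Ryu\n"
)

# Read the CSV with pandas
df = pd.read_csv(csv_text)

# Get the column names and print them
print(list(df.columns))

# Get all unique fighter names and display the count
unique_names = df["fighter_name"].unique()
print(f"{len(unique_names)} unique fighters")
```

Note how each "recipe" comment from step 2 survives as a section marker in the final script, which makes the review step with the duck straightforward.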
## Functions & Classes
### Functions
The primary mechanism for achieving reusable code is the function: a self-contained block of code that accepts inputs, processes them, and returns outputs.
Functions give you modularity (small isolated pieces), abstraction (you can use a function without knowing its internals), testability (verify each one in isolation), and make collaboration straightforward (different people build different functions).
A well-designed function follows a few rules:
- **Descriptive name:** use a verb-noun pair that explains the action, e.g. `calculate_mean` rather than `do_math`.
- **Single responsibility:** one function should do one thing. If it does three things, split it into three functions.
- **No hidden dependencies:** pass all required data as arguments instead of relying on global variables.
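The three rules above can be seen side by side in a short sketch (the function names and data are illustrative):

```python
# Contrasting sketch for the design rules above; names are illustrative.

data = [1.0, 2.0, 3.0]


# Bad: vague name, hidden dependency on the global variable `data`
def do_math():
    return sum(data) / len(data)


# Good: descriptive verb-noun name, single responsibility,
# all required inputs passed explicitly as arguments
def calculate_mean(values: list[float]) -> float:
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


print(calculate_mean([1.0, 2.0, 3.0]))  # 2.0
```

The second version can be tested and reused anywhere, because its behavior depends only on its arguments.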
### Docstrings
Every function should have a docstring: a string literal placed at the top of its definition. A good docstring lets a user understand how to call the function without reading the implementation.
This course follows the NumPy/SciPy docstring style. A complete docstring includes a summary, parameter descriptions, return types, and usage examples:
```python
def add_numbers(a: int, b: int) -> int:
    """Add two numbers together.

    Parameters
    ----------
    a : int
        The first number to add.
    b : int
        The second number to add.

    Returns
    -------
    int
        The sum of `a` and `b`.

    Examples
    --------
    >>> add_numbers(2, 3)
    5
    """
    return a + b
```
### Classes
For more advanced architectural requirements, the class serves as a versatile structural pattern. A class acts as a blueprint for creating objects (instances), allowing data (attributes) and behavior (methods) to be bundled together into a single logical entity.
Classes provide encapsulation (keeping data and its manipulating functions together), state management (maintaining internal data across multiple method calls), and inheritance (allowing new structures to be derived from existing ones).
A well-designed class adheres to specific principles:
- **Noun-based naming:** class names should be descriptive nouns in PascalCase (e.g., `DataProcessor`).
- **High cohesion:** a class should represent a single concept, with all its methods closely related to its internal state.
- **Low coupling:** dependencies on external classes or global state should be minimized to ensure the class can be tested and reused independently.
### Object-Oriented (OO) Coding Pattern
Object-Oriented Programming (OOP) is a paradigm organized around objects rather than discrete actions, and data rather than pure logic. Implementing OO patterns facilitates the management of complex state and behavior in large, reproducible codebases.
Core OO principles include:
Encapsulation: Internal states are hidden from the outside, preventing unintended interference. Interaction occurs only through well-defined public methods.
Inheritance: Code duplication is reduced by allowing new, specialized classes to inherit properties and methods from generalized parent classes.
Polymorphism: Different classes can be treated uniformly if they share a common interface, enhancing flexibility in code design.
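The three principles can be illustrated in one minimal sketch (the class names here are illustrative, not part of the course material):

```python
# Minimal sketch of encapsulation, inheritance, and polymorphism;
# all class names are illustrative.
import math


class Shape:
    """Generalized parent class defining a common interface."""

    def area(self) -> float:
        raise NotImplementedError


class Rectangle(Shape):
    """Specialized child class (inheritance)."""

    def __init__(self, length: float, width: float):
        # Encapsulation: leading underscores mark these as internal state
        self._length = length
        self._width = width

    def area(self) -> float:
        return self._length * self._width


class Circle(Shape):
    """Another specialization sharing the same interface."""

    def __init__(self, radius: float):
        self._radius = radius

    def area(self) -> float:
        return math.pi * self._radius ** 2


# Polymorphism: both classes expose area(), so code can treat
# a mixed collection of them uniformly.
shapes = [Rectangle(2.0, 3.0), Circle(1.0)]
total_area = sum(shape.area() for shape in shapes)
```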
Like functions, classes and their methods require comprehensive docstrings so they remain readable and easy for collaborators to use.
```python
class DataProcessor:
    """A class used to process and scale numerical data.

    Parameters
    ----------
    scaling_factor : float
        The multiplier applied to the data during processing.

    Attributes
    ----------
    scaling_factor : float
        The stored scaling factor.
    """

    def __init__(self, scaling_factor: float):
        self.scaling_factor = scaling_factor

    def scale_value(self, value: float) -> float:
        """Scale a single numerical value.

        Parameters
        ----------
        value : float
            The input value to be scaled.

        Returns
        -------
        float
            The scaled value.
        """
        return value * self.scaling_factor
```
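State management in this pattern looks like the sketch below: the scaling factor is set once at construction and reused across method calls. The class is repeated here (with docstrings trimmed) so the sketch stands alone; the example values are illustrative.

```python
# Usage demo for the DataProcessor pattern above
# (docstrings trimmed to keep the sketch short).


class DataProcessor:
    def __init__(self, scaling_factor: float):
        self.scaling_factor = scaling_factor

    def scale_value(self, value: float) -> float:
        return value * self.scaling_factor


# State management: the scaling factor is stored once at construction...
processor = DataProcessor(scaling_factor=2.5)

# ...and every later method call reuses it without passing it again.
first = processor.scale_value(4.0)    # 10.0
second = processor.scale_value(10.0)  # 25.0
```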
## Unit Testing
Unit testing verifies that individual components (functions, classes) behave exactly as intended. In the context of refactoring, tests act as your safety net: when you restructure or optimize a function, a test suite catches regressions immediately.
Three reasons to write tests:
Quality: tests verify correct behavior across a range of inputs, including edge cases.
Safe refactoring: if a test fails after cleanup, you know the behavior changed before that change reaches production.
Living documentation: tests show unambiguously how a function should be called and what it returns.
### Setting up pytest
pytest is the standard testing framework for Python. To integrate it into your project:
1. Create a `tests/` directory in your project root, separate from your source code.
2. Install pytest: `pip install pytest`
3. Write tests in files named `test_*.py`; pytest discovers these automatically.
Here is a minimal example. Given a function in your source code:
```python
# src/mypkgs/calculate.py
def calculate_area(length, width):
    """Calculate the area of a rectangle."""
    return length * width
```
Write a corresponding test:
```python
# tests/test_calculate.py
from mypkgs.calculate import calculate_area


def test_calculate_area():
    assert calculate_area(2, 3) == 6
    assert calculate_area(0, 5) == 0
    assert calculate_area(-1, 5) == -5
```
Run tests from your project root:
```shell
python -m pytest
```
pytest reports which tests passed and which failed, with tracebacks for failures.
Once your tests pass, you can refactor freely. Just rerun pytest after every change.
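As a test suite grows, pytest's `parametrize` decorator keeps many input/output cases in a single test function. The sketch below reuses the rectangle example; `calculate_area` is copied inline (and the file name is hypothetical) so the sketch stays self-contained, whereas a real test would import it from the source package.

```python
# tests/test_calculate_param.py (hypothetical file name).
# calculate_area is copied inline here for self-containment; a real
# test would import it from the source package instead.
import pytest


def calculate_area(length, width):
    """Calculate the area of a rectangle."""
    return length * width


@pytest.mark.parametrize(
    "length, width, expected",
    [
        (2, 3, 6),    # typical case
        (0, 5, 0),    # edge case: zero dimension
        (-1, 5, -5),  # edge case: negative input
    ],
)
def test_calculate_area(length, width, expected):
    assert calculate_area(length, width) == expected
```

Each tuple becomes its own test in the pytest report, so a failing edge case is pinpointed immediately.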