Core Design Principles — Modularity, SoC, Coupling & Cohesion - Software Engineering Fundamentals

Why Principles, Not Rules

Software design principles are not rules to follow blindly. They are heuristics — guidelines that point you toward better designs most of the time, but that require judgment to apply correctly. The moment you treat a principle as an absolute rule, you end up with over-engineered, dogmatic code that’s harder to maintain than the “unprincipled” code it replaced.

Think of it like engineering judgment. A structural engineer knows that reducing deflection is generally good, but sometimes you want deflection (seismic isolation, for example). The principle “minimize deflection” is a starting point for reasoning, not a commandment.

The four principles in this lesson (Modularity, Separation of Concerns, Loose Coupling, and Tight Cohesion) underpin most good software designs. They appear at every level: function design, module structure, system architecture, and organizational design. Master them, and you’ll see them everywhere.

Modularity

Modularity means dividing a system into distinct, self-contained units (modules) that can be developed, tested, and understood independently. Each module has a clear responsibility and communicates with other modules through well-defined interfaces.

The engineering parallel: prefabrication. A modular building uses standardized components manufactured off-site. Each component is tested independently, interfaces are standardized (bolt patterns, connection details), and components can be replaced without demolishing the building.

Non-Modular Code

# Everything in one function — not modular
def process_sensor_data(filepath):
    # Read file
    with open(filepath) as f:
        raw = f.readlines()

    # Parse data
    data = []
    for line in raw:
        parts = line.strip().split(",")
        timestamp = float(parts[0])
        value = float(parts[1])
        if value < 0 or value > 1000:
            print(f"Warning: suspicious value {value}")
            continue
        data.append((timestamp, value))

    # Compute statistics
    values = [d[1] for d in data]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)

    # Generate report
    report = "Sensor Report\n"
    report += f"Points: {len(values)}\n"
    report += f"Mean: {mean:.2f}\n"
    report += f"Variance: {variance:.2f}\n"

    with open("report.txt", "w") as f:
        f.write(report)

    return mean, variance

Modular Code

# Separated into focused modules

def read_sensor_file(filepath):
    """Read raw sensor data from a CSV file."""
    with open(filepath) as f:
        return f.readlines()


def parse_sensor_data(raw_lines):
    """Parse raw lines into validated (timestamp, value) tuples."""
    data = []
    for line in raw_lines:
        parts = line.strip().split(",")
        timestamp = float(parts[0])
        value = float(parts[1])
        if value < 0 or value > 1000:
            continue
        data.append((timestamp, value))
    return data


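def compute_statistics(data):
    """Compute mean and population variance from sensor data."""
    # Guard against an empty data set, which would otherwise divide by zero
    if not data:
        raise ValueError("no valid sensor readings to summarize")
    values = [d[1] for d in data]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "variance": variance, "count": len(values)}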


def generate_report(stats, output_path):
    """Write a formatted report from computed statistics."""
    report = "Sensor Report\n"
    report += f"Points: {stats['count']}\n"
    report += f"Mean: {stats['mean']:.2f}\n"
    report += f"Variance: {stats['variance']:.2f}\n"
    with open(output_path, "w") as f:
        f.write(report)

The modular version has clear benefits: each function can be tested independently, the parser accepts lines from any source (a file, a network stream, a test fixture), a different file format needs only a new parser, the statistics function can process data from any source, and the report generator can be swapped for a different format without touching the analysis logic.
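One way to see the testability benefit is to run the parser and the statistics function on in-memory lines, with no file involved. The following is a minimal sketch that restates the two functions compactly so the snippet runs on its own; the sample readings are made up for illustration.

```python
def parse_sensor_data(raw_lines):
    """Same parsing rule as the lesson's version, restated for a standalone run."""
    data = []
    for line in raw_lines:
        parts = line.strip().split(",")
        timestamp = float(parts[0])
        value = float(parts[1])
        if value < 0 or value > 1000:
            continue  # drop out-of-range readings
        data.append((timestamp, value))
    return data


def compute_statistics(data):
    """Mean and population variance of the value column."""
    values = [d[1] for d in data]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "variance": variance, "count": len(values)}


# In-memory lines stand in for a file: no disk access is needed.
raw_lines = ["0.0,10.0\n", "1.0,20.0\n", "2.0,5000.0\n", "3.0,30.0\n"]

data = parse_sensor_data(raw_lines)   # the 5000.0 reading is filtered out
stats = compute_statistics(data)

assert stats["count"] == 3
assert stats["mean"] == 20.0
```

Neither function needed a file, a report, or the other function's internals to be checked, which is the practical payoff of splitting the original monolith.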

Separation of Concerns (SoC)

Separation of Concerns means each module, class, or function should address a single concern — a distinct aspect of the system’s functionality. SoC is related to modularity but goes deeper: it’s about what you separate, not just that you separate.

Common concerns that should be separated:

Data access (reading from files/databases) vs. business logic (processing and computation)
Input validation vs. core processing
Presentation (how results are displayed) vs. logic (how results are computed)
Configuration vs. execution
Error handling vs. normal flow
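As an illustration of the second pair in the list above (input validation vs. core processing), here is a small sketch. The function names are hypothetical, and the 0 to 1000 range is borrowed from the sensor example earlier in this lesson.

```python
def is_plausible(value, low=0.0, high=1000.0):
    """Validation concern: decide whether a reading is acceptable."""
    return low <= value <= high


def mean_of(values):
    """Processing concern: assumes its input has already been validated."""
    return sum(values) / len(values)


readings = [12.0, -5.0, 18.0, 2500.0]  # illustrative values

# The caller wires the two concerns together; neither knows about the other.
valid = [v for v in readings if is_plausible(v)]
average = mean_of(valid)

print(valid)
print(average)
```

Changing the acceptable range touches only `is_plausible`, and changing the statistic touches only `mean_of`.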

SoC Violation

# Concerns are mixed: database, computation, and formatting
def get_beam_capacity(beam_id):
    import sqlite3
    conn = sqlite3.connect("structures.db")
    cursor = conn.cursor()
    cursor.execute(
        "SELECT width, height, fy FROM beams WHERE id = ?", (beam_id,)
    )
    row = cursor.fetchone()
    conn.close()

    width, height, fy = row
    area = width * height
    capacity = area * fy * 0.9  # phi factor

    print(f"Beam {beam_id}: {capacity:.1f} kN")
    return capacity

Concerns Separated

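# Data access concern
import sqlite3
from contextlib import closing


def fetch_beam_properties(beam_id, db_path="structures.db"):
    """Retrieve beam properties from the database."""
    # closing() releases the connection even if the query raises
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT width, height, fy FROM beams WHERE id = ?", (beam_id,)
        ).fetchone()
    # fetchone() returns None when no row matches the id
    if row is None:
        raise LookupError(f"No beam with id {beam_id!r}")
    width, height, fy = row
    return {"width": width, "height": height, "fy": fy}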


# Computation concern
def calculate_capacity(width, height, fy, phi=0.9):
    """Calculate beam capacity using basic area method."""
    area = width * height
    return area * fy * phi


# Presentation concern
def format_capacity_result(beam_id, capacity):
    """Format capacity result for display."""
    return f"Beam {beam_id}: {capacity:.1f} kN"

Now you can test the capacity calculation without a database. You can change the database to a REST API without touching the calculation. You can change the output format without touching either.
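The first of those claims can be checked directly. Below is a minimal sketch of tests for the computation and presentation functions that never open a database. The two functions are restated so the snippet runs on its own, and the input numbers are illustrative, not real section properties.

```python
def calculate_capacity(width, height, fy, phi=0.9):
    """Restated from the lesson: area times yield strength times phi."""
    area = width * height
    return area * fy * phi


def format_capacity_result(beam_id, capacity):
    """Restated from the lesson: presentation only."""
    return f"Beam {beam_id}: {capacity:.1f} kN"


def test_capacity_scales_with_phi():
    nominal = calculate_capacity(width=2.0, height=3.0, fy=10.0, phi=1.0)
    reduced = calculate_capacity(width=2.0, height=3.0, fy=10.0)  # phi defaults to 0.9
    assert nominal == 60.0
    assert abs(reduced - 0.9 * nominal) < 1e-9


def test_format_is_independent_of_calculation():
    # The formatter can be tested with any number, computed or not
    assert format_capacity_result("B-12", 54.0) == "Beam B-12: 54.0 kN"


test_capacity_scales_with_phi()
test_format_is_independent_of_calculation()
```

In the mixed version, the same checks would require a populated `structures.db` and capturing printed output.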

Loose Coupling

Coupling measures how much one module depends on the internal details of another. Tight coupling means changes in module A force changes in module B. Loose coupling means modules interact through well-defined interfaces, and internal changes don’t ripple outward.

Tight Coupling

# Report generator is tightly coupled to Solver's internals
class Solver:
    def __init__(self):
        self._results = {}
        self._mesh = None
        self._converged = False

    def solve(self, model):
        self._mesh = self._generate_mesh(model)
        self._results = self._run_analysis(self._mesh)
        self._converged = True


class ReportGenerator:
    def generate(self, solver):
        # Directly accesses Solver's internal state
        if not solver._converged:
            raise ValueError("Solver has not converged")

        mesh_size = len(solver._mesh.elements)
        max_stress = max(solver._results["stress"])
        # ... builds report from internal data

Loose Coupling

# Report generator depends only on a results interface
class Solver:
    def __init__(self):
        self._results = {}
        self._mesh = None
        self._converged = False

    def solve(self, model):
        self._mesh = self._generate_mesh(model)
        self._results = self._run_analysis(self._mesh)
        self._converged = True

    def get_results(self):
        """Public interface — returns a results summary."""
        if not self._converged:
            raise ValueError("Solver has not converged")
        return {
            "mesh_size": len(self._mesh.elements),
            "max_stress": max(self._results["stress"]),
            "converged": self._converged,
        }


class ReportGenerator:
    def generate(self, results):
        # Depends only on the results dictionary, not on Solver
        mesh_size = results["mesh_size"]
        max_stress = results["max_stress"]
        # ... builds report from public data

In the loosely coupled version, ReportGenerator doesn’t know or care that a Solver exists. It takes a results dictionary. You could replace the solver with a completely different analysis engine, and the report generator wouldn’t need a single line changed.
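To make the swap concrete, here is a sketch in which a different engine feeds the same report generator. `AnalysisResults` and `HandCalcEngine` are hypothetical names, and the report layout is an assumption, since the original elides how the report is built.

```python
from typing import TypedDict


class AnalysisResults(TypedDict):
    """Hypothetical name for the results contract returned by get_results()."""
    mesh_size: int
    max_stress: float
    converged: bool


class ReportGenerator:
    def generate(self, results: AnalysisResults) -> str:
        # Uses only the keys of the results contract, never a solver object
        return (
            f"Elements: {results['mesh_size']}\n"
            f"Max stress: {results['max_stress']:.1f}"
        )


class HandCalcEngine:
    """Hypothetical replacement engine: no mesh, no Solver internals."""

    def get_results(self) -> AnalysisResults:
        return {"mesh_size": 1, "max_stress": 182.5, "converged": True}


report = ReportGenerator().generate(HandCalcEngine().get_results())
print(report)
```

Writing the contract down as a `TypedDict` is one option among several; the point is that the dictionary's keys, not the solver's attributes, are what the two sides agree on.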

Key insight: Coupling is measured by the question: “If I change module A’s internals, how many other modules break?” If the answer is “none,” you have loose coupling. If the answer is “most of them,” you have tight coupling.

Tight Cohesion

Cohesion measures how closely the elements within a module are related to each other. High cohesion means everything in a module serves a single, focused purpose. Low cohesion means a module is a grab bag of unrelated functionality.

Low Cohesion

# This module does too many unrelated things
class ProjectUtilities:
    def parse_ifc_file(self, filepath):
        ...  # IFC parsing logic

    def send_email_notification(self, recipient, subject, body):
        ...  # email sending logic

    def calculate_wind_load(self, height, exposure_category):
        ...  # structural engineering calculation

    def format_currency(self, amount, locale="en_US"):
        ...  # currency formatting

    def compress_backup(self, directory, output_path):
        ...  # file compression logic

High Cohesion

# Each class has a focused, coherent responsibility
class IFCParser:
    def parse(self, filepath):
        ...  # IFC parsing logic

    def validate(self, model):
        ...  # IFC validation logic

    def extract_elements(self, model, element_type):
        ...  # element extraction logic


class WindLoadCalculator:
    def __init__(self, code="ASCE7-22"):
        self.code = code

    def calculate_pressure(self, height, exposure_category):
        ...  # wind pressure calculation

    def calculate_force(self, pressure, tributary_area):
        ...  # wind force calculation

    def get_exposure_factor(self, exposure_category, height):
        ...  # exposure factor lookup

The high-cohesion version groups related functionality together. If you need to change how IFC files are parsed, you know exactly where to look. If you need to update the wind load calculation to a new code edition, all the relevant logic is in one place.

Tip: The coupling/cohesion relationship is inverse: increasing cohesion within modules tends to decrease coupling between modules. When each module does one thing well, it needs less from other modules.

SOLID Principles: A Brief Map

The SOLID principles are five object-oriented design principles that formalize many of the ideas above. We won’t deep-dive into each one here — they deserve their own lesson — but here’s a map so you recognize them:

Principle	Name	Core Idea
S	Single Responsibility	A class should have only one reason to change (high cohesion)
O	Open/Closed	Open for extension, closed for modification (add behavior without changing existing code)
L	Liskov Substitution	Subtypes must be substitutable for their base types without breaking the program
I	Interface Segregation	Clients should not depend on interfaces they don’t use (small, focused interfaces)
D	Dependency Inversion	Depend on abstractions, not concrete implementations (loose coupling)

Notice how SOLID maps onto the principles we’ve already discussed. S is cohesion. O is about designing for change. L keeps implementations interchangeable behind a shared interface, which is what loose coupling relies on. I is about SoC at the interface level. D is about loose coupling. These aren’t new ideas; they’re formalized expressions of the same fundamental thinking.
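As one example of that mapping, the D row can be sketched with `typing.Protocol`. All class names here are hypothetical, the capacity formula is the simplified one from the SoC example, and the numbers are illustrative.

```python
from typing import Protocol


class BeamSource(Protocol):
    """The abstraction that high-level code depends on (hypothetical name)."""

    def fetch(self, beam_id: str) -> dict: ...


class InMemoryBeamSource:
    """One concrete source; a database- or REST-backed class could replace it."""

    def __init__(self, beams: dict):
        self._beams = beams

    def fetch(self, beam_id: str) -> dict:
        return self._beams[beam_id]


class CapacityService:
    def __init__(self, source: BeamSource, phi: float = 0.9):
        self._source = source  # depends on the abstraction, not a concrete class
        self._phi = phi

    def capacity(self, beam_id: str) -> float:
        p = self._source.fetch(beam_id)
        return p["width"] * p["height"] * p["fy"] * self._phi


source = InMemoryBeamSource({"B-1": {"width": 2.0, "height": 3.0, "fy": 10.0}})
service = CapacityService(source, phi=1.0)
print(service.capacity("B-1"))
```

`CapacityService` never imports a database driver; any object with a matching `fetch` method satisfies the protocol, which is the same loose coupling described earlier, stated as a type.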

Exercise 4.1: Refactoring Analysis

Exercise: Read the following function and identify which design principles it violates. Then describe (in words, not code) how you would refactor it.

def run_analysis(project_path):
    # Read config
    import json
    with open(f"{project_path}/config.json") as f:
        config = json.load(f)

    # Connect to database
    import psycopg2
    conn = psycopg2.connect(
        host=config["db_host"],
        dbname=config["db_name"],
        user=config["db_user"],
        password=config["db_password"],
    )
    cursor = conn.cursor()

    # Load model data
    cursor.execute("SELECT * FROM models WHERE project = %s", (config["project_id"],))
    models = cursor.fetchall()

    # Run analysis on each model
    results = []
    for model in models:
        nodes = model[3]  # magic index
        elements = model[4]  # magic index
        material = model[5]  # magic index

        # Inline solver
        stiffness = build_stiffness_matrix(nodes, elements, material)
        forces = apply_loads(config["load_cases"])
        displacements = solve(stiffness, forces)
        stresses = compute_stresses(displacements, elements, material)

        # Check code compliance
        max_stress = max(stresses)
        if material == "steel":
            limit = 250  # MPa, hardcoded
        elif material == "concrete":
            limit = 30  # MPa, hardcoded
        ratio = max_stress / limit

        # Store results
        cursor.execute(
            "INSERT INTO results (model_id, max_stress, ratio) VALUES (%s, %s, %s)",
            (model[0], max_stress, ratio),
        )

        # Send notification
        import smtplib
        server = smtplib.SMTP("smtp.company.com")
        server.sendmail(
            "analysis@company.com",
            config["notify_email"],
            f"Model {model[0]} complete. Ratio: {ratio:.2f}",
        )

        results.append({"model_id": model[0], "ratio": ratio})

    conn.commit()
    conn.close()
    return results

For each violation you identify, state:

Which principle is violated (Modularity, SoC, Coupling, Cohesion)
Where in the code the violation occurs
What harm it causes (why does this matter in practice?)
How you would fix it

Quiz

Question: A module called DataProcessor reads CSV files, validates data ranges, computes moving averages, generates HTML charts, sends email alerts when thresholds are exceeded, and logs all operations to a file. Which of the following best describes its design problem?

a) It has tight coupling because it depends on external libraries.
b) It has low cohesion because it handles multiple unrelated concerns in a single module.
c) It violates modularity because it uses too many functions.
d) It has too much abstraction and needs to be simplified into fewer layers.

Answer

b) It has low cohesion because it handles multiple unrelated concerns in a single module.

The DataProcessor module handles six distinct concerns: file I/O, validation, computation, visualization, notification, and logging. These are unrelated: changing how charts are generated has nothing to do with how emails are sent. This is a classic low-cohesion “God module” that tries to do everything. The fix is to separate each concern into its own module: a CSVReader, a Validator, a StatisticsEngine, a ChartGenerator, an AlertService, and a Logger. Option (a) confuses coupling with dependency: depending on external libraries is not tight coupling if you use them through well-defined interfaces.