
Python Offensive Security 101: Python Fundamentals

Notes based on the 'Introduction to Python for Offensive Security' course by Red Team Leaders (https://courses.redteamleaders.com/courses/ce82d160-2ef7-4f2c-8b15-e5ea742b1877 ).

Data Types

Summary

Arbitrary-Precision Integers int

Python integers have unlimited precision. No 32/64-bit limits.

Common Security Uses:

  • Addresses, offsets, opcode fields

  • Encoding binary values (IP, ports)

  • Cryptographic math (RSA, ECC)

  • Payload and packet manipulation

Advantages:

  • Native support for big numbers

Pitfalls:

  • You must know how many bytes to allocate in .to_bytes()

  • Large values raise OverflowError when packed into byte fields that are too small


EXAMPLE:

n = 0x1337
n.bit_length()              # → 13
n.to_bytes(2, 'big')        # → b'\x13\x37'
int.from_bytes(b'\x13\x37', 'big')  # → 4919
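The overflow pitfall listed above can be reproduced directly. This is a minimal sketch; the value 70000 and the variable names are illustrative only.

```python
port = 70000  # illustrative value that does not fit in an unsigned 16-bit field

try:
    port.to_bytes(2, "big")
    fits = True
except OverflowError:
    fits = False

# Size the field from the value instead of guessing (at least one byte).
needed = max(1, (port.bit_length() + 7) // 8)
packed = port.to_bytes(needed, "big")
```

Deriving the length from bit_length() avoids the exception, but fixed-width protocol fields still need an explicit range check.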

Quick Function Reference Table

| Function | Type / Module | Use Case |
| --- | --- | --- |
| int(x, base) | int | Convert string to int (e.g., hex) |
| int.from_bytes(b, endian) | int ↔ bytes | Bytes → Integer |
| n.to_bytes(n_bytes, endian) | int ↔ bytes | Integer → Bytes |
| hex(n) | int | Int → hex string |
| bytes.fromhex(s) | bytes | Hex string → Bytes |
| b.hex() | bytes | Bytes → Hex string |
| bytearray(b) | bytearray | Mutable byte buffer |
| struct.pack(fmt, ...) | struct | Pack data to binary format |
| struct.unpack(fmt, b) | struct | Unpack binary to values |
| math.isclose(a, b) | float | Approximate float comparison |
| Decimal("0.1") | decimal | Exact decimal arithmetic |
| Fraction(1, 3) | fractions | Rational numbers |
| s.find(sub) | str | Find substring |
| s.replace(a, b) | str | Replace substrings |
| " ".join(list) | str | Join list of strings |
| binascii.hexlify(b) | bytes | Manual hex encoding |
| base64.b64encode(b) | bytes | Encode payload in base64 |
| hashlib.sha256(b).hexdigest() | hashlib | SHA-256 hex digest of bytes |

Mini CheatSheet Snippets
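A few of the functions from the reference table combined in one short sketch. The payload bytes, port, and address are arbitrary illustrative values.

```python
import base64
import hashlib
import struct

payload = bytes.fromhex("deadbeef")          # hex string -> bytes
as_hex = payload.hex()                       # bytes -> hex string

# "!" selects network byte order: a 16-bit port followed by a 32-bit address.
packet = struct.pack("!HI", 8080, 0x7F000001)
port, addr = struct.unpack("!HI", packet)

encoded = base64.b64encode(payload)          # bytes -> base64 bytes
digest = hashlib.sha256(payload).hexdigest() # 64 hex characters
value = int("0x1337", 16)                    # hex string -> int
```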

Data Structures

Summary

Lists list

Lists are mutable, ordered containers, great for short-lived target batches, scan results, and small queues. Don’t use them as huge persistent stores.

Basic Ops

Filtering & Transforming
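A minimal sketch of filtering and transforming scan results with comprehensions and sorted(). The result dictionaries and their keys are made up for illustration.

```python
results = [
    {"host": "10.0.0.5", "port": 22, "state": "open"},
    {"host": "10.0.0.5", "port": 23, "state": "closed"},
    {"host": "10.0.0.9", "port": 443, "state": "open"},
]

# Filter: keep only open ports.
open_ports = [r for r in results if r["state"] == "open"]

# Transform: build "host:port" strings.
targets = [f'{r["host"]}:{r["port"]}' for r in open_ports]

# Sort by port, highest first.
by_port = sorted(open_ports, key=lambda r: r["port"], reverse=True)
```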

Stacks & Queues

Use deque instead of list.pop(0) for performance.
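A short sketch of both patterns: a list works well as a stack, while collections.deque gives constant-time removal from the front for queues. The host strings are placeholders.

```python
from collections import deque

# Queue (FIFO): popleft() is O(1), unlike list.pop(0), which shifts every item.
queue = deque(["10.0.0.1", "10.0.0.2"])
queue.append("10.0.0.3")
first = queue.popleft()

# Stack (LIFO): a plain list is fine, since append() and pop() work at the end.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()
```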

Slicing & Copying
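A small sketch of slicing and of the difference between shallow and deep copies. The port and host values are illustrative.

```python
import copy

ports = [21, 22, 80, 443, 8080]
first_two = ports[:2]
last_two = ports[-2:]
every_other = ports[::2]

batches = [["10.0.0.1"], ["10.0.0.2"]]
shallow = batches[:]            # new outer list, but the same inner lists
deep = copy.deepcopy(batches)   # fully independent copy

batches[0].append("10.0.0.99")  # visible through the shallow copy only
```

A slice copy is enough for flat lists; nested structures need copy.deepcopy() to avoid shared inner objects.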

Performance & Memory

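  • Each slot stores a pointer (~8 bytes on 64-bit builds).

  • Use array, numpy, or generators for large numeric data.

  • list.clear() empties the list in place, so every reference to it sees the change; in CPython the old storage is released rather than kept for reuse.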

Security & Concurrency

  • Not thread-safe for compound operations → use queue.Queue for parallel scans.

  • Don’t keep secrets in lists. Clear (clear() / del) sensitive data after use, keeping in mind that this only drops references and does not scrub memory.

  • Serialize results carefully (avoid leaking credentials or tokens).


EXAMPLE:

Characteristics Summary

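| Characteristic | list | tuple | dict | set |
| --- | --- | --- | --- | --- |
| Mutability | Yes | No | Yes | Yes |
| Ordering | Yes | Yes | Yes (insertion order, 3.7+) | No |
| Allows Duplicates | Yes | Yes | Keys unique, values can duplicate | No |
| Hashable | No | Yes (if all elements are hashable) | No (keys must be hashable) | No (elements must be hashable; frozenset is hashable) |
| Typical Use (offsec) | Target batches, scan results, stacks/queues | IP/port pairs, coordinates, read-only constants | Map service→port, results keyed by host/IP | Dedup targets, fast membership checks, visited nodes |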

Control Flow

Boolean Contexts

Any object can be tested for truth:

  • False: False, None, 0, 0.0, "", [], (), {}, set(), and other empty containers

  • True: everything else, unless a class defines __bool__ or __len__ to say otherwise

Conditional Expressions

Loops

while

  • Evaluates its condition before each iteration

  • while ... else executes else if loop exits normally (no break)

for

  • Iterates over any iterable

  • Supports enumerate, zip, unpacking

  • for ... else executes else if the loop completes normally (no break)
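The loop else clause is easy to misread, so here is a minimal sketch. first_open and the is_open callback are hypothetical names used only for illustration.

```python
def first_open(ports, is_open):
    """Return the first port for which is_open(port) is true, else None."""
    for port in ports:
        if is_open(port):
            found = port
            break            # skips the else block
    else:
        found = None         # runs only when the loop was not broken
    return found
```

The else branch acts as a "nothing found" handler, which removes the need for a separate flag variable.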

Loop Control

| Statement | Effect |
| --- | --- |
| break | Exit innermost loop immediately |
| continue | Skip rest of iteration, continue loop |
| pass | Do nothing (placeholder) |

Comprehensions

Comprehensions build lists, sets, dicts, and generators concisely.

Each comprehension has its own local scope, so loop variables do not leak into the enclosing code.
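A short sketch of the four comprehension forms and of the scoping rule. All values are illustrative.

```python
ports = [22, 80, 443, 80, 22]

unique = {p for p in ports}                    # set comprehension
labels = {p: f"tcp/{p}" for p in unique}       # dict comprehension
doubled = [p * 2 for p in ports]               # list comprehension
squares_total = sum(n * n for n in range(4))   # generator expression

# The loop variable inside a comprehension does not overwrite outer names.
p = "outer"
_ = [p for p in range(3)]
```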

Pattern Matching (match ... case) 3.10+

Multi-way branch with destructuring

Functions

Definition

In Python, functions are defined with def followed by a name. The docstring describes what the function does and can be viewed using help(). The return statement sends a value back; if not used, the function returns None.
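A minimal sketch of a definition with a docstring, plus a function without return. Both function names are hypothetical.

```python
def normalize_host(host):
    """Return the host name lower-cased and stripped of whitespace."""
    return host.strip().lower()


def log_only(message):
    """Print a message; there is no return statement, so the result is None."""
    print(message)


doc = normalize_host.__doc__   # the text that help(normalize_host) displays
result = log_only("scan started")
```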

Argument Handling

Python uses pass-by-object-reference, meaning that if a function modifies a mutable object, the change is reflected outside the function. Reassigning the parameter only rebinds the local name and does not affect the caller; only in‑place mutations do.
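The difference between mutating and rebinding can be sketched as follows. The function names are illustrative.

```python
def add_target(targets, host):
    targets.append(host)      # in-place mutation: the caller sees it


def reset_targets(targets):
    targets = []              # rebinds the local name only
    return targets


scope = ["10.0.0.1"]
add_target(scope, "10.0.0.2")
fresh = reset_targets(scope)  # scope itself is untouched
```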

Types of Calls

  • Positional: good for simple, fast operations (e.g., quick scan routines).

  • Keyword: clearer when optional parameters matter.

  • Mixed: combines both styles.

Default Argument Values

Default values are evaluated once, when the function is defined. Using mutable defaults can cause unexpected shared state between calls.

Bad example:

Good example:
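A minimal sketch of both variants; collect_bad and collect_good are hypothetical names. The bad version shares one list across calls, while the good version uses None as a sentinel and creates a fresh list each time.

```python
def collect_bad(host, seen=[]):        # the default list is created once
    seen.append(host)
    return seen


def collect_good(host, seen=None):     # None signals "no list supplied"
    if seen is None:
        seen = []
    seen.append(host)
    return seen


bad_first = collect_bad("a")
bad_second = collect_bad("b")          # same list object as bad_first
good_first = collect_good("a")
good_second = collect_good("b")        # independent list
```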

Variable-Length Parameters

  • *args collects extra positional arguments (e.g., multiple hosts).

  • **kwargs collects keyword metadata (e.g., tool name, scan type).

Calling with unpacking:
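A short sketch of collecting and unpacking arguments. The scan function and the option names are invented for illustration.

```python
def scan(*hosts, **meta):
    """Collect positional hosts into a tuple and keyword metadata into a dict."""
    return {"hosts": list(hosts), "meta": meta}


targets = ["10.0.0.1", "10.0.0.2"]
options = {"tool": "demo", "scan_type": "syn"}

# * unpacks the list into positional arguments, ** the dict into keywords.
job = scan(*targets, **options)
```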

Positional-Only and Keyword-Only Parameters

  • Positional-only helps avoid accidental keyword usage.

  • Keyword-only ensures clarity for functions that require explicit naming.
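Both markers can be sketched in one signature: parameters before / are positional-only, parameters after * are keyword-only. The connect function is hypothetical and performs no network I/O.

```python
def connect(host, port, /, *, timeout=3.0, verify=True):
    return (host, port, timeout, verify)


ok = connect("10.0.0.1", 443, timeout=1.5)

try:
    connect(host="10.0.0.1", port=443)   # keywords for positional-only params
    keyword_rejected = False
except TypeError:
    keyword_rejected = True

try:
    connect("10.0.0.1", 443, 1.5)        # positional value for keyword-only param
    positional_rejected = False
except TypeError:
    positional_rejected = True
```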

Advanced Functions

Type Annotations and Type Hints

Annotations document expected types but do not enforce them at runtime. They help IDEs, linters, and static analyzers.
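A minimal sketch showing that annotations are metadata only. port_label is a made-up function; passing a str where int is annotated still runs.

```python
def port_label(port: int, proto: str = "tcp") -> str:
    return f"{proto}/{port}"


label = port_label("80")            # wrong type per the hint, no runtime error
hints = port_label.__annotations__  # the stored annotation mapping
```

A static checker such as mypy would flag the call, which is where the hints pay off.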

First-Class and Higher-Order Functions

Functions can be passed around like any other value, which is useful for scan pipelines, dispatch tables, or callbacks.
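A small dispatch-table sketch. The handler functions only build strings and stand in for real scan routines; all names are illustrative.

```python
def scan_tcp(host):
    return f"tcp scan of {host}"


def scan_udp(host):
    return f"udp scan of {host}"


# Dispatch table: functions stored as dictionary values.
HANDLERS = {"tcp": scan_tcp, "udp": scan_udp}


def run(kind, host):
    handler = HANDLERS.get(kind)
    if handler is None:
        raise ValueError(f"unknown scan kind: {kind!r}")
    return handler(host)


def apply_all(func, hosts):
    """Higher-order function: takes another function as an argument."""
    return [func(h) for h in hosts]
```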

Lambda Expressions

Anonymous, single-expression functions. Good for simple sorting, filtering or scoring logic.
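A brief sketch of lambdas as sort keys and filters. The (host, score) tuples are invented sample data.

```python
findings = [("10.0.0.5", 7.5), ("10.0.0.9", 9.8), ("10.0.0.2", 4.0)]

# Sort by score, highest first.
by_score = sorted(findings, key=lambda f: f[1], reverse=True)

# Keep only findings at or above a threshold.
critical = list(filter(lambda f: f[1] >= 7.0, findings))
```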

Closures

A closure remembers the environment in which it was created, which is useful for counters, tracking state, or small in-memory registries.
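A minimal counter sketch. make_counter is a hypothetical factory; each call creates an independent count.

```python
def make_counter():
    count = 0

    def hit():
        nonlocal count   # modify the variable captured from make_counter
        count += 1
        return count

    return hit


attempts = make_counter()
attempts()
attempts()
third = attempts()
other = make_counter()   # separate state
```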

Decorators

Decorators wrap a function to extend its behavior, which suits timing scans, adding logging, or enforcing preconditions.
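A timing decorator can be sketched as below. timed, fake_scan, and the last_elapsed attribute are illustrative names, not part of any library.

```python
import functools
import time


def timed(func):
    @functools.wraps(func)               # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the duration even when func raises.
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timed
def fake_scan(host):
    return f"scanned {host}"


outcome = fake_scan("10.0.0.1")
```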

Generator Functions

Generators produce values lazily, ideal for processing large lists of hosts or ports without consuming a lot of memory.
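A minimal sketch of lazy target generation. The targets generator is hypothetical; only the pairs actually consumed are ever created.

```python
import itertools


def targets(hosts, ports):
    """Yield (host, port) pairs one at a time instead of building a list."""
    for host in hosts:
        for port in ports:
            yield host, port


gen = targets(["10.0.0.1", "10.0.0.2"], range(1, 65536))
first_three = list(itertools.islice(gen, 3))   # pulls only three pairs
```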

Recursion

Python supports recursion, but the stack depth is limited (1000 frames by default; see sys.getrecursionlimit()). For deep structures (e.g., large nested JSON scan results), iterative solutions are safer.
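An iterative traversal with an explicit stack can be sketched like this. count_leaves is an illustrative helper, and the 5000-level nesting is an arbitrary depth chosen to exceed the default recursion limit.

```python
def count_leaves(data):
    """Count non-container values in nested dicts/lists without recursion."""
    stack = [data]
    leaves = 0
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)
        else:
            leaves += 1
    return leaves


# Build a list nested 5000 levels deep around a single value.
deep = ["leaf"]
for _ in range(5000):
    deep = [deep]

deep_count = count_leaves(deep)
```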

Useful Standard Library Tools

  • functools.lru_cache – cache repeated lookups (e.g., DNS lookups, parsing).

  • functools.partial – pre‑configure a function (e.g., fix a port or timeout).

  • itertools – lazy pipelines for processing scan data.

  • contextlib.contextmanager – create with-style context managers (e.g., temporary connections).
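The tools above can be sketched together. resolve, probe, and session are stand-ins that perform no real network work.

```python
import contextlib
import functools
import itertools


@functools.lru_cache(maxsize=128)
def resolve(name):
    return f"resolved:{name}"          # stand-in for a slow lookup


def probe(host, port, timeout):
    return (host, port, timeout)


# partial: fix port and timeout once, supply only the host later.
probe_https = functools.partial(probe, port=443, timeout=2.0)


@contextlib.contextmanager
def session(log):
    log.append("open")
    try:
        yield log
    finally:
        log.append("close")            # runs even if the body raises


resolve("example.test")
resolve("example.test")                # served from the cache
cache_hits = resolve.cache_info().hits

log = []
with session(log):
    log.append("work")

pairs = list(itertools.product(["a", "b"], [22, 80]))
```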

Introspection

Inspect functions at runtime; this is essential for plugin systems, dynamic dispatch, or auto‑generating CLI tools.
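A minimal sketch using the inspect module. The scan function is a hypothetical plugin entry point.

```python
import inspect


def scan(host, port=80, *, timeout=3.0):
    """Hypothetical plugin entry point."""
    return host, port, timeout


sig = inspect.signature(scan)
names = list(sig.parameters)
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty
}
doc = inspect.getdoc(scan)
```

A CLI generator could walk sig.parameters to build one option per parameter.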

Performance Notes

  • Local variables are faster than globals.

  • Default arguments are evaluated once, so avoid expensive defaults.

Modules & Packages

In offensive security, Python scripts frequently evolve from small proof-of-concepts into full operator toolkits. As the codebase grows, modules and packages become essential for maintaining clarity, reusability, and operational reliability.

Modules

A module is a single .py file containing names such as variables, classes, and functions.

Importing Modules

Importing a full module: import module binds the entire module object.

Importing specific functions: from module import x binds only the referenced names.

Aliasing for cleaner code: as provides an alias, often useful when avoiding naming collisions in large frameworks.

__all__ and import *

__all__ defines the list of names that a module intentionally exposes when someone uses from module import *.

Example:
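A self-contained sketch: the module is built in memory so the snippet runs as one file. demo_api and helper are made-up names; in a real project SOURCE would simply be the contents of demo_api.py.

```python
import sys
import types

SOURCE = '''
__all__ = ["PublicClass", "public_fn"]

class PublicClass:
    pass

def public_fn():
    return "public"

def helper():          # no leading underscore, but not listed in __all__
    return "internal"
'''

# Register an in-memory module so that "from demo_api import *" can find it.
module = types.ModuleType("demo_api")
exec(SOURCE, module.__dict__)
sys.modules["demo_api"] = module

namespace = {}
exec("from demo_api import *", namespace)
imported = sorted(n for n in namespace if not n.startswith("__"))
```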

With this, only PublicClass and public_fn are imported.

Without __all__, import * imports every name not starting with an underscore, which may expose internal helpers unintentionally.

Using __all__ makes the module’s public API explicit and controlled.

Modules Execution Context

Every Python module has a specific execution context. Among the dunder variables that are defined automatically, two matter most here:

  • __name__: indicates how the module is being executed.

    • When the file is run directly: __name__ == "__main__".

    • When the file is imported: __name__ becomes the module’s name (e.g., "my_module").

  • __package__: indicates which package the module belongs to.

    • Inside a package, it stores the package name.

    • For a top-level module, it is an empty string (and None for a script run directly).

This enables the script guard, a pattern that ensures certain code (e.g., test routines or demos) runs only when the file is executed directly, and not when imported as a module.
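The script guard can be sketched as follows. main and its argument handling are illustrative.

```python
def main(argv=None):
    args = list(argv or [])
    return f"running with {len(args)} argument(s)"


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    import sys

    print(main(sys.argv[1:]))
```

Keeping the logic in main() and the guard minimal lets other modules import and test the same function.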

Module Search Path

Python resolves imports using the module search path, exposed at runtime as the list sys.path.

Search order:

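  1. Script directory (or the current directory when running interactively)

  2. PYTHONPATH entries

  3. Standard library (including its ZIP archive, if present)

  4. site-packages (third-party packages installed with pip)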

sys.path can be edited, but virtual environments or proper packaging are preferred.

Packages and Sub-packages

A package is a directory of modules, normally marked by an __init__.py file (the file is optional for namespace packages).

It can contain:

  • modules

  • sub-packages

  • nested hierarchies (e.g., c2/http/handlers.py)

Packages enable the creation of larger, organized red-team frameworks instead of single-file scripts.

Key points:

  • Absolute imports (mypkg.core) are clearer and are the style recommended by PEP 8; PEP 328 defines the absolute/relative import syntax.

  • Relative imports (using .) only work inside packages.

  • Without __init__.py, Python creates a namespace package (PEP 420), useful for large plugin-style projects.

The __init__.py File

__init__.py defines what a package exposes when imported. It can:

  • Specify public exports with __all__

  • Re-export useful names at the package level

  • Stay lightweight (avoid heavy work on import)

Resource Files Inside Packages

Python packages can include resource files (e.g., wordlists, templates, payloads). Since Python 3.9, the recommended way to load them is importlib.resources.files().
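A runnable sketch using importlib.resources.files(). The package demo_pkg and its wordlist.txt are created in a temporary directory purely for the demonstration; a real package would already be installed.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package containing one resource file.
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
    (pkg_dir / "wordlist.txt").write_text("admin\nroot\n", encoding="utf-8")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        resource = importlib.resources.files("demo_pkg").joinpath("wordlist.txt")
        words = resource.read_text(encoding="utf-8").split()
    finally:
        # Undo the temporary import setup.
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg", None)
```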

This method works even if the package is distributed as a zip or a wheel, making resource access reliable and portable.

Installing External Packages

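Python packages are installed with pip, e.g. pip install <package>.

pip freeze > requirements.txt captures all installed versions in requirements.txt.

For reproducible builds (especially important in security tooling), pin exact versions in that file using the name==version form.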

Distributing a Python Package

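A pyproject.toml file defines the metadata and build configuration needed to package and publish a Python project.

A console script entry point (the [project.scripts] table in pyproject.toml, known as console_scripts in setuptools) creates a cross-platform command-line entry point (e.g., running mypkg launches mypkg.cli.main()).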

Build and publish workflow, where each command plays a different role:

  • pip install build twine → prepares the environment

  • python -m build → packages your project into dist/

  • twine upload dist/* → publishes it to a repository

Virtual Environments

A virtual environment creates an isolated Python interpreter with its own site-packages. This avoids dependency conflicts between projects and ensures repeatable setups.

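Creation and activation: run python -m venv .venv, then source .venv/bin/activate on Linux/macOS or .venv\Scripts\activate on Windows.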

Once activated, all pip installs go inside .venv/, not the system Python.

Why it matters:

  • Keeps tools and libraries separated per project

  • Avoids version collisions (e.g., offensive-security scripts needing older libs)

  • Ensures reproducible environments when sharing code or deploying

  • Prevents breaking system-wide Python packages

File Handling

File handling is a core capability in Python, and it becomes especially important in offensive security tooling, where scripts frequently need to read wordlists, store results, process logs, or manipulate extracted data.

Open Function

open() is the standard interface for working with files in Python. It controls how the file is accessed (whether it's read, written, appended, or opened in binary mode) and also handles encoding and buffering.

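Function signature: open(file, mode="r", buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)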

Common modes

| Mode | Meaning |
| --- | --- |
| "r" | read (default) |
| "w" | write, truncating existing file |
| "a" | append (create if missing) |
| "x" | create exclusively, fail if file exists |
| "+" | update (read + write) |
| add "b" | binary mode ("rb", "wb", etc.) |

Files must be closed to flush buffers and release OS handles; calling close() manually is easy to forget, especially on error paths.

The preferred pattern is the context manager form:
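A minimal sketch of the with form, including the error case. The temporary file and its contents are illustrative.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w", encoding="utf-8") as fh:
    fh.write("admin\nroot\n")

try:
    with open(path, encoding="utf-8") as fh:
        int(fh.readline())          # "admin" is not a number: raises ValueError
except ValueError:
    pass

closed_after_error = fh.closed      # the with block closed the file anyway
os.remove(path)
```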

Within a with block, the file object's __enter__ returns the already-open file and __exit__ ensures it is closed cleanly, even if an exception occurs.

Text vs Binary

  • Text mode decodes bytes into str using the specified encoding (defaults to the platform’s encoding).

  • Binary mode ("rb", "wb") returns raw bytes with no decoding applied.

Binary mode is appropriate for non-text data (images, executables, packets, compressed files) or in scenarios where exact byte preservation is required.
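The same file read in both modes, as a short sketch. The sample string is arbitrary; it contains one non-ASCII character so the byte and character counts differ.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "wb") as fh:
    fh.write("caf\u00e9\n".encode("utf-8"))     # 6 bytes on disk

with open(path, "r", encoding="utf-8") as fh:
    text = fh.read()                            # decoded str

with open(path, "rb") as fh:
    raw = fh.read()                             # untouched bytes

os.remove(path)
```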

Patterns

Reading Patterns

Python provides several file-reading patterns, each suited to different performance and memory needs.

| Method | Description |
| --- | --- |
| fh.read() | Reads the entire file into memory. Not suitable for very large files. |
| fh.readline() | Reads one line; the returned string includes the trailing newline. |
| fh.readlines() | Reads all lines into a list. |
| for line in fh: | Lazy iteration; memory-efficient for large files. |
| fh.read(size) | Reads up to size characters (text mode) or bytes (binary mode); returns fewer when EOF is reached. |

Example pattern (simple tail implementation):
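One possible sketch of a simple tail, assuming a bounded deque is acceptable: the file is iterated lazily and only the last n lines are kept. The tail helper and the sample log file are illustrative.

```python
from collections import deque
import os
import tempfile


def tail(path, n=10, encoding="utf-8"):
    """Return the last n lines of a text file without loading it all."""
    with open(path, encoding=encoding, errors="replace") as fh:
        # deque(maxlen=n) discards older lines as new ones arrive.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


fd, path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "w", encoding="utf-8") as fh:
    fh.writelines(f"line {i}\n" for i in range(100))

last_three = tail(path, 3)
os.remove(path)
```

This still reads the whole file once; for very large logs, seeking backwards from the end in binary mode avoids that cost.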

Writing Patterns

Writing uses the same distinction as reading: text mode writes str, while binary mode writes raw bytes. This is common in offensive tooling when generating payloads, saving extracted artifacts, or storing structured results.

| Method | Description |
| --- | --- |
| fh.write(data) | Writes a string (text mode) or bytes (binary mode). Returns the number of characters/bytes written. |
| fh.writelines(seq) | Writes a sequence of strings or bytes without adding newlines automatically. |
| fh.flush() | Forces buffered content to be written to disk immediately. |
| fh.close() | Flushes buffers and releases the file handle. Automatically handled by with. |

Buffered writing is usually faster; unbuffered mode (buffering=0, allowed only in binary mode) should be used only when strict real-time writing is needed.
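A short sketch of the write methods in text and binary mode. The CSV-style rows and the two trailing bytes are illustrative.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

rows = [("10.0.0.1", 22), ("10.0.0.2", 80)]

# newline="\n" disables platform newline translation for predictable sizes.
with open(path, "w", encoding="utf-8", newline="\n") as fh:
    written = fh.write("host,port\n")                   # returns character count
    fh.writelines(f"{h},{p}\n" for h, p in rows)        # no newlines added for you
    fh.flush()                                          # push the buffer to the OS

with open(path, "ab") as fh:                            # append raw bytes
    fh.write(b"\x00\x01")

size = os.path.getsize(path)
os.remove(path)
```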

File Positioning

File positioning allows precise control over where the next read or write occurs. This is useful when handling logs, binary blobs, or structured data extracted during offensive operations.

seek() moves the file pointer, while tell() reports its current position.

Offsets are measured from:

  • the start of the file (whence=0),

  • the current position (whence=1), or

  • the end of the file (whence=2).

In text mode, seeking is limited to positions previously returned by tell(). Binary mode provides full byte-level control.
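A minimal sketch of seek() and tell() in binary mode. The byte layout (6-byte header, 7-byte body, 7-byte trailer) is invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryFile() as fh:        # opened in binary read/write mode
    fh.write(b"HEADERpayloadTRAILER")

    fh.seek(6)                              # absolute offset from the start
    body = fh.read(7)
    position = fh.tell()

    fh.seek(-7, os.SEEK_END)                # 7 bytes before the end (whence=2)
    trailer = fh.read()

    fh.seek(-7, os.SEEK_CUR)                # back 7 bytes from here (whence=1)
    again = fh.read(3)
```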

The pathlib Module

pathlib.Path provides an object-oriented interface for working with filesystem paths. It replaces many os.path calls and produces cleaner, safer code (especially helpful when handling logs, wordlists, payloads, or extracted data during offensive operations).

Common Path Methods

| Method | Purpose |
| --- | --- |
| .exists() | Checks if the file or directory exists |
| .stat() | Returns metadata such as size, timestamps, and permissions |
| .iterdir() | Iterates over directory contents |
| .glob(pattern) | Finds files matching patterns (e.g., "*.txt") |
| .read_text() / .write_text() | Read/write text files |
| .read_bytes() / .write_bytes() | Read/write binary files |
| .unlink() | Deletes a file |
| .rename(target) | Renames or moves a file |
| .mkdir(parents=True, exist_ok=True) | Creates directories |
| .resolve() | Returns the absolute, canonical path |

Example
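A minimal sketch that exercises several of the methods above inside a temporary directory. The loot/hosts layout and file names are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # The / operator joins path components portably.
    loot = Path(tmp) / "loot" / "hosts"
    loot.mkdir(parents=True, exist_ok=True)

    report = loot / "10.0.0.1.txt"
    report.write_text("22/tcp open\n", encoding="utf-8")

    names = sorted(p.name for p in loot.glob("*.txt"))
    content = report.read_text(encoding="utf-8")
    existed = report.exists()

    report.unlink()
    exists_after_unlink = report.exists()
```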

This approach keeps path handling clean, avoids string concatenation errors, and works consistently across operating systems.

Directory and Metadata Operations

Python provides several modules for interacting with the filesystem beyond simple file reads and writes. os, shutil, and stat expose functions for listing directories, copying files, modifying permissions, and manipulating metadata (operations frequently used in offensive tooling when collecting evidence, staging payloads, or archiving output).

Common Functions

| Module / Function | Purpose |
| --- | --- |
| os.listdir(path) | Lists directory contents |
| os.makedirs(path, exist_ok=True) | Creates directory trees safely |
| os.chmod(path, mode) | Changes file permissions |
| shutil.copy2(src, dst) | Copies a file while preserving metadata (mtime, permissions) |
| shutil.copytree(src, dst) | Recursively copies an entire directory |
| shutil.rmtree(path) | Recursively deletes a directory tree |
| shutil.make_archive(base, format, root_dir) | Creates zip/tar archives |
| stat.S_IRUSR, stat.S_IXUSR | Permission flags (read, execute for owner) |

Example
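A minimal sketch combining os, shutil, and stat in a temporary directory. The out directory and file names are illustrative.

```python
import os
import shutil
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "out")
    os.makedirs(src, exist_ok=True)

    note = os.path.join(src, "notes.txt")
    with open(note, "w", encoding="utf-8") as fh:
        fh.write("done\n")

    os.chmod(note, stat.S_IRUSR | stat.S_IWUSR)          # owner read/write only
    shutil.copy2(note, os.path.join(tmp, "notes.bak"))   # copy with metadata

    # Archive the contents of src as <tmp>/out.zip.
    archive = shutil.make_archive(os.path.join(tmp, "out"), "zip", root_dir=src)

    listing = sorted(os.listdir(tmp))
    shutil.rmtree(src)                                   # remove the whole tree
    remaining = sorted(os.listdir(tmp))
```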

shutil provides higher-level operations that bundle multiple system calls into a single, well-tested function, which reduces the room for mistakes and inconsistent states. This is especially important when tooling handles logs, loot files, or output directories during offensive engagements.

Temporary Files and Atomic Writes

Temporary files are essential when a script needs to generate output safely without exposing partially written data. Python’s tempfile module provides utilities for creating secure, uniquely named temporary files.

An atomic write pattern works by writing data to a temporary file and then replacing the target file in a single filesystem operation. This guarantees that other processes never see an incomplete file (a useful property when storing scan results, staging payloads, or updating logs in environments where multiple tools may read the same files).
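One way to sketch the pattern, assuming the temporary file is created in the target's own directory. atomic_write and the results.json file name are illustrative.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())       # push the data to disk before the swap
        os.replace(tmp_path, path)      # single rename step
    except BaseException:
        try:
            os.unlink(tmp_path)         # clean up the partial temp file
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "results.json")
    atomic_write(target, b'{"status": "old"}')
    atomic_write(target, b'{"status": "new"}')
    with open(target, "rb") as fh:
        final = fh.read()
    leftovers = [n for n in os.listdir(workdir) if n.startswith(".tmp-")]
```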

The os.replace call performs an atomic swap on most filesystems, provided the temporary file and the target are on the same filesystem, ensuring that the final file appears fully formed. This pattern prevents corruption, race conditions, and half-written outputs (issues that can cause inconsistent results during offensive tooling execution).

Memory-Mapped Files

Memory-mapped files allow a script to treat file contents as if they were byte arrays in memory. This is especially valuable when dealing with large binaries, disk images, or forensic artifacts, since reads and writes occur without repeatedly copying data between Python and the operating system.

A memory-mapped region exposes the underlying file directly through slicing, enabling fast random access, useful when parsing structured binary formats, scanning raw disk sectors, or extracting payloads from large images.

Because the file is mapped into virtual memory, operations are efficient even for multi-gigabyte data sources. This pattern is well-suited for offensive tooling that needs to inspect or manipulate binary data at arbitrary offsets.
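A small sketch with the mmap module. The file contents (zero padding around a MAGIC marker) are invented to show searching and slicing at an offset.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as fh:
    fh.write(b"\x00" * 4096 + b"MAGIC" + b"\x00" * 4096)
    fh.flush()

    # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
    with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        offset = mm.find(b"MAGIC")          # search without reading into a bytes object
        chunk = mm[offset:offset + 5]       # slice like a bytes object
        total = len(mm)
```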

Common Structured Formats

Different file formats are frequently used in offensive tooling for configuration, storing results, exchanging structured data, or compressing large logs. The following cards summarise the most common ones.

JSON (JavaScript Object Notation)

Simple, human-readable, ideal for configuration files or structured outputs.

  • json.load / json.dump → work with file handles

  • json.loads / json.dumps → work with strings

  • Useful for tool configuration, module settings, or scan summaries.
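Both pairs of functions in one short sketch. The summary dictionary is sample data, and io.StringIO stands in for an open text file.

```python
import io
import json

summary = {"host": "10.0.0.1", "open_ports": [22, 443], "notes": None}

# String-based: dumps / loads.
text = json.dumps(summary, indent=2, sort_keys=True)
restored = json.loads(text)

# File-handle-based: dump / load.
buffer = io.StringIO()
json.dump(summary, buffer)
buffer.seek(0)
from_file = json.load(buffer)
```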

Exception Handling

File operations commonly fail due to missing files, permission issues, or invalid paths. Handling these exceptions explicitly helps tools behave predictably, especially when dealing with user-supplied input or external resources.

  • It is recommended to catch specific exceptions rather than the broad OSError.

  • Useful subclasses include: FileNotFoundError, PermissionError, IsADirectoryError, FileExistsError, NotADirectoryError, and others.

  • Targeted exception handling provides clearer error messages and safer behaviour, particularly in offensive tooling where unexpected paths or permissions are common.
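Targeted handling can be sketched as below. read_wordlist and its (lines, error) return convention are illustrative choices, not a standard API.

```python
import tempfile
from pathlib import Path


def read_wordlist(path):
    """Return (lines, error); error is None on success."""
    try:
        with open(path, encoding="utf-8") as fh:
            return [line.strip() for line in fh if line.strip()], None
    except FileNotFoundError:
        return [], "missing"
    except IsADirectoryError:
        return [], "is a directory"
    except PermissionError:
        return [], "permission denied"
    except OSError as exc:              # fallback for any other OS-level failure
        return [], f"other OS error: {exc}"


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "words.txt"
    good.write_text("admin\n\nroot\n", encoding="utf-8")
    words, error = read_wordlist(good)
    missing_words, missing_error = read_wordlist(Path(tmp) / "nope.txt")
```

Specific subclasses are listed before OSError because Python uses the first matching except clause.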

File I/O Best Practices

Performance Guidelines

  • Read and write in chunks (e.g., read(65536)) for large files.

  • Use binary mode to avoid per-character encoding overhead when copying raw data.

  • Prefer Path.read_bytes() and Path.write_bytes() for concise code paths.

  • Use memory-mapped files (mmap) for random access on multi-gigabyte files.

  • Avoid many small writes; accumulate into a buffer or use io.BufferedWriter.
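The chunked-read guideline can be sketched with a streaming hash. sha256_stream is an illustrative helper, and io.BytesIO stands in for a large file opened in binary mode.

```python
import hashlib
import io


def sha256_stream(fh, chunk_size=65536):
    """Hash a binary stream in fixed-size chunks to keep memory use flat."""
    digest = hashlib.sha256()
    while chunk := fh.read(chunk_size):   # an empty bytes object ends the loop
        digest.update(chunk)
    return digest.hexdigest()


data = b"A" * 200_000
streamed = sha256_stream(io.BytesIO(data))
```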

Security Guidelines

| Concern | Mitigation |
| --- | --- |
| Path traversal (../../etc/passwd) | Validate or normalize user-supplied paths with Path.resolve(), then ensure the result stays inside the allowed directory. |
| Untrusted pickle data | Avoid loading untrusted pickle content; use JSON, MessagePack, or custom formats instead. |
| Race conditions (TOCTOU) | Use appropriate open flags (e.g., "xb" for exclusive creation) or write to a temporary file and then atomically rename. |
| Encoding pitfalls | Always specify encoding="utf-8" unless there is a specific need for another encoding. |
| Leaked file handles | Use with blocks consistently to guarantee file handles are closed. |
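The path traversal mitigation can be sketched as follows. safe_join is a hypothetical helper; it resolves the combined path and rejects anything that ends up outside the base directory.

```python
import tempfile
from pathlib import Path


def safe_join(base, user_path):
    """Resolve user_path under base; raise ValueError if it escapes base."""
    base = Path(base).resolve()
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)     # raises ValueError when outside base
    except ValueError:
        raise ValueError(f"path escapes base directory: {user_path!r}") from None
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    inside = safe_join(tmp, "loot/a.txt")
    inside_ok = inside.name == "a.txt"
    try:
        safe_join(tmp, "../../etc/passwd")
        escaped = True
    except ValueError:
        escaped = False
```

Resolving before comparing matters, since a plain string prefix check can be bypassed with ".." segments or symlinks.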
