Python Offensive Security 101: Python Fundamentals
Notes based on the 'Introduction to Python for Offensive Security' course by Red Team Leaders (https://courses.redteamleaders.com/courses/ce82d160-2ef7-4f2c-8b15-e5ea742b1877).
Data Types
Summary
Arbitrary-Precision Integers int
Python integers have unlimited precision. No 32/64-bit limits.
Common Security Uses:
Addresses, offsets, opcode fields
Encoding binary values (IP, ports)
Cryptographic math (RSA, ECC)
Payload and packet manipulation
Advantages:
Native support for big numbers
Pitfalls:
You must know how many bytes to allocate in .to_bytes()
Values too large for the requested length raise OverflowError when packed into small byte fields
EXAMPLE:
n = 0x1337
n.bit_length() # → 13
n.to_bytes(2, 'big') # → b'\x13\x37'
int.from_bytes(b'\x13\x37', 'big') # → 4919
Binary Floating Point (IEEE 754) float
64-bit binary floats, prone to rounding errors.
Common Security Uses:
Packet timestamps
Timing-based attacks
Network latency measurements
Pitfalls:
Tiny rounding errors can ruin time-sensitive exploits
EXAMPLE:
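A minimal sketch of the rounding pitfall, using only the standard library. The values are illustrative and not taken from the course.

```python
import math
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so sums drift slightly.
total = 0.1 + 0.2
exact_match = (total == 0.3)              # equality fails on the tiny error
close_match = math.isclose(total, 0.3)    # tolerance-based comparison succeeds

# Decimal keeps exact base-10 arithmetic when that matters.
decimal_total = Decimal("0.1") + Decimal("0.2")

print(exact_match, close_match, decimal_total)
```

For timing measurements, comparing with a tolerance is usually safer than testing floats for equality.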
Booleans bool
Subclass of int. True == 1, False == 0.
Common Uses:
Network state checks
Execution flags
Guard conditions in exploits
Pitfalls:
None is falsy, but it means "no value", not False; check for it with is None
EXAMPLE:
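A short sketch of both points above: booleans behave like integers, and None should be told apart from other falsy values. The describe helper and the flag list are hypothetical.

```python
# bool is a subclass of int, so True/False behave like 1/0 in arithmetic.
port_states = [True, False, True, True]   # hypothetical "port open" flags
open_count = sum(port_states)             # counts the True values

def describe(banner):
    # Distinguish "no value" (None) from an empty but present value ("").
    if banner is None:
        return "no response"
    if not banner:
        return "empty banner"
    return "banner received"

print(open_count, describe(None), describe(""), describe("SSH-2.0"))
```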
Immutable Byte Sequences bytes
Raw bytes (0–255), immutable.
Common Security Uses:
Shellcode
Network protocol crafting
Hashing, signatures, binary blobs
Pitfalls:
Cannot modify without creating a copy
EXAMPLE:
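A minimal sketch of indexing, slicing, and immutability. The byte values are placeholders, not a real payload.

```python
payload = bytes.fromhex("90909090cc")   # illustrative bytes only
first_byte = payload[0]                 # indexing yields an int
head = payload[:2]                      # slicing yields bytes

# bytes objects are immutable: item assignment raises TypeError.
try:
    payload[0] = 0x41
    mutated = True
except TypeError:
    mutated = False

# "Modifying" means building a new object.
patched = b"\x41" + payload[1:]
print(first_byte, head.hex(), mutated, patched.hex())
```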
Mutable Byte Sequences bytearray
Like bytes, but modifiable in-place.
Common Security Uses:
Shellcode patching
Dynamic payloads
Manual field injection in binary headers
Advantages:
You can edit without reallocating
Pitfalls:
Not hashable → cannot be used as dictionary keys
EXAMPLE:
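A small sketch of in-place patching, assuming a placeholder 8-byte buffer. It also shows the hashability pitfall and the usual workaround of converting to bytes.

```python
buf = bytearray(b"\x90" * 8)                     # placeholder buffer
buf[0] = 0xCC                                    # patch a single byte in place
buf[4:8] = (0xDEADBEEF).to_bytes(4, "little")    # overwrite a 4-byte field

# bytearray is unhashable; convert to bytes when a dict key is needed.
try:
    hash(buf)
    hashable = True
except TypeError:
    hashable = False

lookup = {bytes(buf): "patched"}
print(buf.hex(), hashable)
```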
Unicode Text str
Unicode strings, used for readable text manipulation.
Common Security Uses:
Shell command composition
Fuzzing input
Regex on logs or files
Filenames, domains, paths
Pitfalls:
String concatenation with + in loops is slow; collect the parts in a list and use "".join()
EXAMPLE:
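A brief sketch covering a regex on a log line, join-based string building, and explicit str/bytes conversion. The log line and word list are made up.

```python
import re

log_line = "Failed password for root from 10.0.0.5 port 22"   # made-up log line
match = re.search(r"from (\d+\.\d+\.\d+\.\d+) port (\d+)", log_line)
ip, port = match.group(1), int(match.group(2))

# Build long strings with join rather than repeated += in a loop.
words = ["admin", "root", "guest"]
wordlist = "\n".join(words)

# str <-> bytes conversion is always explicit.
encoded = "café".encode("utf-8")
decoded = encoded.decode("utf-8")
print(ip, port, len(encoded), decoded)
```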
Binary Packing & Unpacking struct
Format numbers into raw byte sequences, or extract them.
Common Security Uses:
Building headers (DNS, TCP/IP, etc.)
Parsing binary formats
Exploit payload structure
Pitfalls:
Incorrect byte order = broken payload
Format codes:
! → Network (big-endian)
H → 2-byte unsigned short
I → 4-byte unsigned int
B → 1 byte
EXAMPLE:
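A minimal sketch using the format codes listed above. The 8-byte header layout (type, flags, length, sequence) is hypothetical and does not correspond to a real protocol.

```python
import struct

# Hypothetical header: type (B), flags (B), length (H), sequence (I).
HEADER_FMT = "!BBHI"   # "!" = network byte order (big-endian), no padding

packed = struct.pack(HEADER_FMT, 1, 0x02, 512, 0xDEADBEEF)
msg_type, flags, length, seq = struct.unpack(HEADER_FMT, packed)

# The same value packed with a different byte order yields different bytes.
little = struct.pack("<H", 512)
big = struct.pack("!H", 512)
print(packed.hex(), struct.calcsize(HEADER_FMT), little.hex(), big.hex())
```

The last two lines illustrate the byte-order pitfall: the receiver must unpack with the same order the sender used.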
Quick Function Reference Table
Function | Type / Module | Purpose
int(x, base) | int | Convert string to int (e.g., hex)
int.from_bytes(b, endian) | int ↔ bytes | Bytes → integer
n.to_bytes(n_bytes, endian) | int ↔ bytes | Integer → bytes
hex(n) | int | Int → hex string
bytes.fromhex(s) | bytes | Hex string → bytes
b.hex() | bytes | Bytes → hex string
bytearray(b) | bytearray | Mutable byte buffer
struct.pack(fmt, ...) | struct | Pack data to binary format
struct.unpack(fmt, b) | struct | Unpack binary to values
math.isclose(a, b) | float | Approximate float comparison
Decimal("0.1") | decimal | Exact decimal arithmetic
Fraction(1, 3) | fractions | Rational numbers
s.find(sub) | str | Find substring
s.replace(a, b) | str | Replace substrings
" ".join(list) | str | Join list of strings
binascii.hexlify(b) | bytes | Manual hex encoding
base64.b64encode(b) | bytes | Encode payload in base64
hashlib.sha256(b).hexdigest() | hashlib | SHA-256 hash for binary
Mini CheatSheet Snippets
Data Structures
Summary
Lists list
Lists = mutable, ordered containers, great for short-lived target batches, scan results, and small queues. Don't use them as huge persistent stores.
Basic Ops
Filtering & Transforming
Stacks & Queues
Use deque instead of list.pop(0) for performance.
Slicing & Copying
Performance & Memory
Each slot stores a pointer (~8 bytes on 64-bit builds).
Use array, numpy, or generators for large numeric data.
list.clear() empties the list in place, so every reference to that list sees the change.
Security & Concurrency
Not thread-safe → use queue.Queue for parallel scans.
Don't keep secrets in lists. Clear (clear() / del) sensitive data after use.
Serialize results carefully (avoid leaking credentials or tokens).
EXAMPLE:
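A compact sketch of the list operations named in the headings above (basic ops, filtering, slicing, stack and queue use). The target addresses are placeholders.

```python
from collections import deque

targets = ["10.0.0.1", "10.0.0.2"]            # hypothetical scan targets
targets.append("10.0.0.3")                     # add one item
targets.extend(["10.0.0.4", "10.0.0.5"])       # add several

# Filtering and transforming with a comprehension.
even_hosts = [t for t in targets if int(t.rsplit(".", 1)[1]) % 2 == 0]

# Slicing returns a shallow copy; the original is untouched.
first_two = targets[:2]

# Stack: append/pop at the end. Queue: deque.popleft() is O(1),
# whereas list.pop(0) shifts every remaining element.
stack_top = targets.pop()
queue = deque(targets)
next_target = queue.popleft()
print(even_hosts, first_two, stack_top, next_target)
```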
Tuples tuple
Tuples = immutable, ordered containers. Great for storing coordinates, IP/port pairs, or any read-only sequences in scans and pipelines.
Basic Ops
Immutable → cannot append, delete, or change elements.
Common Use Cases
Keys in dicts or sets → hashable and safe
Return multiple values from a function: return ip, port, status
Read-only sequences, e.g., default headers, config tuples
Unpacking Tricks
Handy for extracting first/last elements or splitting ranges in scanners
Namedtuple / Dataclass
Gives attribute access
Immutable & memory-light
Perfect for storing scan results, port statuses, or target info
EXAMPLE
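A short sketch of unpacking, tuples as dict keys, and namedtuple. The ScanResult type and all values are hypothetical.

```python
from collections import namedtuple

endpoint = ("10.0.0.5", 443)           # hypothetical IP/port pair
ip, port = endpoint                     # plain unpacking

# Star-unpacking splits off the first and last elements.
first, *middle, last = (21, 22, 80, 443, 8080)

# Tuples are hashable (when their items are), so they work as dict keys.
status = {endpoint: "open"}

# namedtuple adds attribute access while staying immutable.
ScanResult = namedtuple("ScanResult", ["host", "port", "state"])
result = ScanResult("10.0.0.5", 443, "open")

try:
    result.state = "closed"             # immutable: raises AttributeError
    changed = True
except AttributeError:
    changed = False
print(ip, port, first, middle, last, result.state, changed)
```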
Dictionaries dict
Dicts = mutable key-value stores. Ideal for mapping services, ports, hosts, scan results.
Basic Ops
Iteration View
Useful Variants
Security Note
Hash-flood attacks: huge sets of colliding keys can slow lookups to O(n).
CPython 3.3+ randomizes the hash seed for str and bytes to mitigate this, but the issue is still relevant when fuzzing protocols or parsing untrusted input.
EXAMPLE:
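A minimal sketch of basic dict operations and iteration views. The "Useful Variants" heading does not name specific types, so defaultdict and Counter are used here as an assumption. All data is made up.

```python
from collections import Counter, defaultdict

services = {22: "ssh", 80: "http"}            # hypothetical port -> service map
services[443] = "https"                        # insert or update
unknown = services.get(8080, "unknown")        # lookup with a default

# Iteration views: keys(), values(), items() reflect the live dict.
open_ports = sorted(services.keys())
pairs = [f"{port}/{name}" for port, name in services.items()]

# defaultdict groups results without key-existence checks.
by_host = defaultdict(list)
for host, port in [("10.0.0.1", 22), ("10.0.0.1", 80), ("10.0.0.2", 443)]:
    by_host[host].append(port)

# Counter tallies occurrences.
states = Counter(["open", "closed", "open", "filtered"])
print(unknown, open_ports, pairs, dict(by_host), states["open"])
```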
Sets set
Sets = unordered, mutable collections for fast membership and deduplication. Ideal for pruning wordlists and tracking visited hosts.
Basic Ops
Useful methods
Comprehension & math ops
Use cases (offsec)
Prune huge wordlists (uniq = set(wordlist))
Track visited nodes in graph/BFS/DFS
Fast membership checks before expensive probes
Perf & security notes
Membership = O(1) average.
Sets use hashing like dicts → conceptually vulnerable to hash-collision floods (same mitigations as dicts).
Prefer sets for memory-light dedup of moderate-size lists; for huge datasets consider Bloom filters or disk-backed DB.
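The set operations outlined in this section can be sketched as follows. The word list and addresses are illustrative.

```python
wordlist = ["admin", "root", "admin", "guest", "root"]   # made-up wordlist
unique = set(wordlist)                                    # deduplicate

visited = set()
visited.add("10.0.0.1")
visited.update(["10.0.0.2", "10.0.0.3"])
visited.discard("10.0.0.9")          # discard() ignores missing items

scope = {"10.0.0.1", "10.0.0.2", "10.0.0.4"}
remaining = scope - visited           # difference: in scope, not yet visited
overlap = scope & visited             # intersection
everything = scope | visited          # union

# Set comprehension.
lengths = {len(w) for w in unique}
print(sorted(unique), sorted(remaining), sorted(overlap), sorted(lengths))
```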
Characteristics Summary
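Property | list | tuple | dict | set
Mutability | Yes | No | Yes | Yes
Ordering | Yes | Yes | Yes (insertion order, Python 3.7+) | No
Allows Duplicates | Yes | Yes | Keys unique, values can duplicate | No
Hashable | No | Yes (if all elements are hashable) | No (keys must be hashable) | No (elements must be hashable; frozenset is the hashable variant)
Typical Use (offsec) | Target batches, scan results, stacks/queues | IP/port pairs, coordinates, read-only constants | Map service → port, results keyed by host/IP | Dedup targets, fast membership checks, visited nodes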
Control Flow
Boolean Contexts
Any object can be tested for truth:
False: False, None, 0, 0.0, "", [], {}, set()
True: everything else
Conditional Expressions
Loops
while
Evaluates the condition before each iteration
while ... else executes the else block if the loop exits normally (no break)
for
Iterates over any iterable
Supports enumerate, zip, unpacking
for ... else executes the else block if the loop completes normally (no break)
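A small sketch of the else clause on both loop types, plus enumerate and zip. The port numbers stand in for real probe results.

```python
ports = [21, 22, 80]                  # hypothetical ports to check
open_ports = {22, 443}                # stand-in for real probe results

# for ... else: the else block runs only when no break occurred.
for index, port in enumerate(ports):
    if port in open_ports:
        found = (index, port)
        break
else:
    found = None

# while ... else: the else block runs when the condition becomes false.
attempts = 0
while attempts < 3:
    attempts += 1
    if attempts == 5:                 # never true here, so no break
        break
else:
    exhausted = True

# zip pairs items from several iterables.
labels = [f"{h}:{p}" for h, p in zip(["a", "b"], [80, 443])]
print(found, attempts, exhausted, labels)
```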
Loop Control
break
Exit innermost loop immediately
continue
Skip rest of iteration, continue loop
pass
Do nothing (placeholder)
Comprehensions
Build lists, sets, dicts, generators concisely
Own local scope; avoids leaking loop vars
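A brief sketch of the four comprehension forms and of the scoping rule. The port list is illustrative.

```python
ports = [21, 22, 80, 443, 8080]

low_ports = [p for p in ports if p < 1024]            # list comprehension
parities = {p % 2 for p in ports}                     # set comprehension
port_names = {p: f"tcp/{p}" for p in ports[:2]}       # dict comprehension
total = sum(p for p in ports)                         # generator expression

# The loop variable lives in the comprehension's own scope,
# so the outer x is not overwritten.
x = "outer"
doubled = [x * 2 for x in range(3)]
print(low_ports, parities, port_names, total, x, doubled)
```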
Pattern Matching (match ... case) 3.10+
Multi-way branch with destructuring
Functions
Definition
In Python, functions are defined with def followed by a name. The docstring describes what the function does and can be viewed using help(). The return statement sends a value back; if not used, the function returns None.
Argument Handling
Python uses pass-by-object-reference: if a function mutates a mutable argument in place, the change is visible to the caller. Rebinding the parameter to a new object inside the function does not affect the caller; only in-place mutations do.
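The difference between mutation and rebinding can be sketched as follows. The function names are hypothetical.

```python
def add_target(targets, host):
    targets.append(host)        # in-place mutation: the caller sees it

def reset_targets(targets):
    targets = []                # rebinding: only the local name changes
    return targets

scope = ["10.0.0.1"]
add_target(scope, "10.0.0.2")
returned = reset_targets(scope)
print(scope, returned)
```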
Types of Calls
Positional: good for simple, fast operations (e.g., quick scan routines).
Keyword: clearer when optional parameters matter.
Mixed: combines both styles.
Default Argument Values
Default values are evaluated once, when the function is defined. Using mutable defaults can cause unexpected shared state between calls.
Bad example:
Good example:
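A minimal sketch of both variants, assuming hypothetical collect_bad and collect_good helpers. The first shares one list across calls; the second uses None as a sentinel.

```python
def collect_bad(host, seen=[]):
    # The same list object is reused on every call.
    seen.append(host)
    return seen

def collect_good(host, seen=None):
    # None is a sentinel; a fresh list is created per call.
    if seen is None:
        seen = []
    seen.append(host)
    return seen

bad_first = list(collect_bad("a"))
bad_second = list(collect_bad("b"))     # still contains "a" from the first call
good_first = collect_good("a")
good_second = collect_good("b")
print(bad_first, bad_second, good_first, good_second)
```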
Variable-Length Parameters
*args collects extra positional arguments (e.g., multiple hosts).
**kwargs collects keyword metadata (e.g., tool name, scan type).
Calling with unpacking:
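A short sketch, assuming a hypothetical scan function that accepts both forms.

```python
def scan(*hosts, **options):
    # hosts arrives as a tuple, options as a dict.
    return {"hosts": hosts, "options": options}

host_list = ["10.0.0.1", "10.0.0.2"]
settings = {"tool": "demo", "scan_type": "syn"}

# * spreads a sequence into positional arguments,
# ** spreads a mapping into keyword arguments.
result = scan(*host_list, **settings)
print(result)
```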
Positional-Only and Keyword-Only Parameters
Positional-only helps avoid accidental keyword usage.
Keyword-only ensures clarity for functions that require explicit naming.
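Both markers can be sketched in one signature (Python 3.8+). Parameters before / are positional-only, and parameters after * are keyword-only. The probe function is hypothetical.

```python
def probe(host, port, /, *, timeout=1.0, verbose=False):
    return f"{host}:{port} timeout={timeout} verbose={verbose}"

ok = probe("10.0.0.1", 80, timeout=2.5)

# Passing positional-only parameters by keyword is rejected.
try:
    probe(host="10.0.0.1", port=80)
    accepted_keywords = True
except TypeError:
    accepted_keywords = False

# Passing keyword-only parameters positionally is rejected too.
try:
    probe("10.0.0.1", 80, 2.5)
    accepted_positional_timeout = True
except TypeError:
    accepted_positional_timeout = False
print(ok, accepted_keywords, accepted_positional_timeout)
```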
Advanced Functions
Type Annotations and Type Hints
Annotations document expected types but do not enforce them at runtime. They help IDEs, linters, and static analyzers.
First-Class and Higher-Order Functions
Functions can be passed around like any other value, which is useful for scan pipelines, dispatch tables, or callbacks.
Lambda Expressions
Anonymous, single-expression functions. Good for simple sorting, filtering or scoring logic.
Closures
A closure remembers the environment in which it was created, which is useful for counters, tracking state, or small in-memory registries.
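A minimal counter sketch. Each call to the hypothetical make_counter creates independent state.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count      # modify the enclosing variable, not a new local
        count += 1
        return count
    return increment

hits = make_counter()
misses = make_counter()     # separate closure, separate count
hits()
hits()
total_hits = hits()
total_misses = misses()
print(total_hits, total_misses)
```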
Decorators
Decorators wrap a function to extend its behavior, which makes them a good fit for timing scans, adding logging, or enforcing preconditions.
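A timing decorator might look like the sketch below. The timed decorator, its last_elapsed attribute, and fake_scan are all illustrative names.

```python
import functools
import time

def timed(func):
    @functools.wraps(func)               # keep the wrapped name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the duration even if func raises.
            wrapper.last_elapsed = time.perf_counter() - start
    wrapper.last_elapsed = None
    return wrapper

@timed
def fake_scan(host):
    """Placeholder for real work."""
    return f"scanned {host}"

result = fake_scan("10.0.0.1")
print(result, fake_scan.__name__, fake_scan.last_elapsed)
```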
Generator Functions
Generators produce values lazily, ideal for processing large lists of hosts or ports without consuming a lot of memory.
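A short sketch of lazy production, assuming a hypothetical port_range generator. Only the values actually requested are computed.

```python
import itertools

def port_range(start, stop):
    """Yield ports one at a time instead of building a full list."""
    port = start
    while port <= stop:
        yield port
        port += 1

gen = port_range(1, 65535)
first_three = list(itertools.islice(gen, 3))   # consumes only three values
next_port = next(gen)                          # resumes where it left off
print(first_three, next_port)
```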
Recursion
Python supports recursion, but the stack depth is limited (~1000). For deep structures (e.g., large nested JSON scan results), iterative solutions are safer.
Useful Standard Library Tools
functools.lru_cache: cache repeated lookups (e.g., DNS lookups, parsing).
functools.partial: pre-configure a function (e.g., fix a port or timeout).
itertools: lazy pipelines for processing scan data.
contextlib.contextmanager: create with-style context managers (e.g., temporary connections).
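Three of the tools listed above can be sketched with stand-in functions. The names resolve, connect, and session are hypothetical, and no network access happens.

```python
import functools
from contextlib import contextmanager

calls = []

@functools.lru_cache(maxsize=128)
def resolve(name):
    calls.append(name)                   # record real invocations
    return f"resolved:{name}"            # stand-in for an expensive lookup

resolve("example.test")
resolve("example.test")                  # served from the cache

def connect(host, port, timeout):
    return (host, port, timeout)

# partial fixes some arguments ahead of time.
connect_https = functools.partial(connect, port=443, timeout=2.0)
conn = connect_https("10.0.0.1")

events = []

@contextmanager
def session(name):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")   # runs even if the body raises

with session("demo") as s:
    events.append(f"use {s}")
print(len(calls), conn, events)
```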
Introspection
Inspect functions at runtime: a must for plugin systems, dynamic dispatch, or auto-generating CLI tools.
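One way to do this is the standard inspect module, sketched here on a hypothetical probe function.

```python
import inspect

def probe(host: str, port: int = 80, *, timeout: float = 1.0) -> bool:
    """Hypothetical probe used only for illustration."""
    return True

sig = inspect.signature(probe)
param_names = list(sig.parameters)
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty
}
doc = inspect.getdoc(probe)
print(param_names, defaults, doc)
```

A CLI generator could, for example, turn each entry in defaults into an optional flag.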
Performance Notes
Local variables are faster than globals.
Default arguments are evaluated once, so avoid expensive defaults.
Modules & Packages
In offensive security, Python scripts frequently evolve from small proof-of-concepts into full operator toolkits. As the codebase grows, modules and packages become essential for maintaining clarity, reusability, and operational reliability.
Modules
A module is a single .py file containing names such as variables, classes, and functions.
Importing Modules
Importing a full module: import module binds the entire module object.
Importing specific functions: from module import x binds only the referenced names.
Aliasing for cleaner code: as provides an alias, often useful when avoiding naming collisions in large frameworks.
__all__ and import *
__all__ defines the list of names that a module intentionally exposes when someone uses from module import *.
Example:
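A runnable sketch that builds a throwaway module in memory so the effect can be observed. The module name demo_mod and the helper function are assumptions; in practice the source would live in its own .py file.

```python
import sys
import types

source = '''
__all__ = ["PublicClass", "public_fn"]

class PublicClass:
    pass

def public_fn():
    return "public"

def helper():          # no leading underscore, but excluded by __all__
    return "internal"
'''

# Register the in-memory module so the import statement can find it.
demo_mod = types.ModuleType("demo_mod")
exec(source, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```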
With this, only PublicClass and public_fn are imported.
Without __all__, import * imports every name not starting with an underscore, which may expose internal helpers unintentionally.
Using __all__ makes the module's public API explicit and controlled.
Modules Execution Context
Every Python module has a specific execution context, and two dunder variables are automatically defined:
__name__: indicates how the module is being executed.
When the file is run directly: __name__ == "__main__".
When the file is imported: __name__ becomes the module's name (e.g., "my_module").
__package__: indicates which package the module belongs to.
Inside a package, it stores the package name.
At the top level, it is an empty string.
This enables the script guard, a pattern that ensures certain code (e.g., test routines or demos) runs only when the file is executed directly, and not when imported as a module.
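The script guard can be sketched as follows. The main function is a placeholder for a demo or test routine.

```python
def main():
    # Placeholder for a demo or test routine.
    return "running demo"

# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(main())
```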
Module Search Path
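Python resolves imports using the Module Search Path, visible with sys.path (import sys; print(sys.path)).
Search order:
Script directory (or the current directory in interactive mode)
PYTHONPATH entries
Standard library (ZIP archives listed in sys.path are searched like directories)
site-packages (third-party installs)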
sys.path can be edited, but virtual environments or proper packaging are preferred.
Packages and Sub-packages
A package is a directory with an optional __init__.py file.
It can contain:
modules
sub-packages
nested hierarchies (e.g., c2/http/handlers.py)
Packages enable the creation of larger, organized red-team frameworks instead of single-file scripts.
Key points:
Absolute imports (mypkg.core) are clearer and recommended (PEP 328).
Relative imports (using .) only work inside packages.
Without __init__.py, Python creates a namespace package (PEP 420), useful for large plugin-style projects.
The __init__.py File
__init__.py defines what a package exposes when imported. It can:
Specify public exports with __all__
Re-export useful names at the package level
Stay lightweight (avoid heavy work on import)
Resource Files Inside Packages
Python packages can include resource files (e.g., wordlists, templates, payloads). Since Python 3.9, the recommended way to load them is:
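A sketch assuming the API meant here is importlib.resources.files(), which was added in Python 3.9. The load_text_resource helper is hypothetical, and the demo reads from the standard-library json package only because a real tool's package is not available here.

```python
from importlib import resources   # files() requires Python 3.9+

def load_text_resource(package, name, encoding="utf-8"):
    """Return the text of a file shipped inside an importable package."""
    return resources.files(package).joinpath(name).read_text(encoding=encoding)

# Stand-in demo: a real tool would pass its own package name
# and a resource such as a wordlist or template file.
text = load_text_resource("json", "__init__.py")
print(len(text) > 0)
```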
This method works even if the package is distributed as a zip or a wheel, making resource access reliable and portable.
Installing External Packages
Python packages are installed with pip, e.g. python -m pip install <package>.
pip freeze > requirements.txt captures all installed versions.
For reproducible builds (especially important in security tooling), pin exact versions with ==, e.g. <package>==<version>.
Distributing a Python Package
A pyproject.toml defines the metadata and configuration needed to package and publish a Python project:
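A console_scripts entry point (declared under [project.scripts] in pyproject.toml) creates a cross-platform command-line entry point (e.g., running mypkg launches mypkg.cli.main()).
Build and publish workflow:
python -m pip install build twine
python -m build
python -m twine upload dist/*
Each command plays a different role:
install build/twine → prepares the environment
build → packages your project
upload → publishes it to a repository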
Virtual Environments
A virtual environment creates an isolated Python interpreter with its own site-packages.
This avoids dependency conflicts between projects and ensures repeatable setups.
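Creation and activation: python -m venv .venv, then source .venv/bin/activate (Linux/macOS) or .venv\Scripts\activate (Windows).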
Once activated, all pip installs go inside .venv/, not the system Python.
Why it matters:
Keeps tools and libraries separated per project
Avoids version collisions (e.g., offensive-security scripts needing older libs)
Ensures reproducible environments when sharing code or deploying
Prevents breaking system-wide Python packages
File Handling
File handling is a core capability in Python, and it becomes especially important in offensive security tooling, where scripts frequently need to read wordlists, store results, process logs, or manipulate extracted data.
Open Function
open() is the standard interface for working with files in Python. It controls how the file is accessed (whether it's read, written, appended, or opened in binary mode) and also handles encoding and buffering.
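Function signature:
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)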
Common modes
"r"
read (default)
"w"
write, truncating existing file
"a"
append (create if missing)
"x"
create exclusively, fail if file exists
"+"
update (read + write)
add "b"
binary mode ("rb", "wb", etc.)
Manually closing files is required to flush buffers and release OS handles.
The preferred pattern is the context manager form:
open() returns the file object, __enter__ hands it to the with block, and __exit__ ensures it closes cleanly, even if an exception occurs.
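The context manager form can be sketched as follows. The file lives in a throwaway temporary directory, and the failure is simulated.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.txt")   # throwaway location

with open(path, "w", encoding="utf-8") as fh:
    fh.write("10.0.0.1:22 open\n")
closed_after_block = fh.closed

# The file is closed even when the body raises.
try:
    with open(path, "a", encoding="utf-8") as fh2:
        fh2.write("10.0.0.2:80 open\n")
        raise RuntimeError("simulated failure")
except RuntimeError:
    closed_after_error = fh2.closed

with open(path, encoding="utf-8") as fh3:
    lines = fh3.read().splitlines()
print(closed_after_block, closed_after_error, lines)
```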
Text vs Binary
Text mode decodes bytes into str using the specified encoding (defaults to the platform's encoding).
Binary mode ("rb", "wb") returns raw bytes with no decoding applied.
Binary mode is appropriate for non-text data (images, executables, packets, compressed files) or in scenarios where exact byte preservation is required.
Patterns
Reading Patterns
Python provides several file-reading patterns, each suited to different performance and memory needs.
fh.read()
Reads the entire file into memory. Not suitable for very large files.
fh.readline()
Reads one line; the returned string includes the trailing newline.
fh.readlines()
Reads all lines into a list.
for line in fh:
Lazy iteration; memory-efficient for large files.
fh.read(size)
Reads size bytes; returns fewer when EOF is reached.
Example pattern (simple tail implementation):
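The original implementation is not shown, so the following is one possible sketch rather than the course's version. It combines lazy iteration with a bounded deque so that only the last n lines stay in memory.

```python
import os
import tempfile
from collections import deque

def tail(path, n=10, encoding="utf-8"):
    """Return the last n lines of a text file, reading it lazily."""
    with open(path, encoding=encoding) as fh:
        # A deque with maxlen discards older lines as new ones arrive.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]

# Demo on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "scan.log")
with open(demo_path, "w", encoding="utf-8") as fh:
    fh.writelines(f"line {i}\n" for i in range(1, 101))

last_three = tail(demo_path, 3)
print(last_three)
```

For very large files, seeking backwards from the end in binary mode avoids reading the whole file, at the cost of more code.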
Writing Patterns
Writing uses the same distinction as reading: text mode writes str, while binary mode writes raw bytes. This is common in offensive tooling when generating payloads, saving extracted artifacts, or storing structured results.
fh.write(data)
Writes a string (text mode) or bytes (binary mode). Returns the number of characters/bytes written.
fh.writelines(seq)
Writes a sequence of strings or bytes without adding newlines automatically.
fh.flush()
Forces buffered content to be written to disk immediately.
fh.close()
Flushes buffers and releases the file handle. Automatically handled by with.
Buffered writing is usually faster; unbuffered mode (buffering=0, allowed only in binary mode) should be used only when strict real-time writing is needed.
File Positioning
File positioning allows precise control over where the next read or write occurs. This is useful when handling logs, binary blobs, or structured data extracted during offensive operations.
seek() moves the file pointer, while tell() reports its current position.
Offsets are measured from:
the start of the file (whence=0),
the current position (whence=1), or
the end of the file (whence=2).
In text mode, seeking is limited to positions previously returned by tell(). Binary mode provides full byte-level control.
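A binary-mode sketch of the three whence values, using a made-up blob layout.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "blob.bin")
with open(path, "wb") as fh:
    fh.write(b"HEADERpayload-bytesTRAILER")   # made-up layout

with open(path, "rb") as fh:
    fh.seek(6)                      # whence=0: absolute offset from the start
    body_start = fh.tell()
    chunk = fh.read(7)
    fh.seek(1, os.SEEK_CUR)         # whence=1: skip one byte from here
    after_skip = fh.read(5)
    fh.seek(-7, os.SEEK_END)        # whence=2: seven bytes before the end
    trailer = fh.read()
print(body_start, chunk, after_skip, trailer)
```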
The pathlib Module
pathlib.Path provides an object-oriented interface for working with filesystem paths. It replaces many os.path calls and produces cleaner, safer code (especially helpful when handling logs, wordlists, payloads, or extracted data during offensive operations).
Common Path Methods
.exists()
Checks if the file or directory exists
.stat()
Returns metadata such as size, timestamps, and permissions
.iterdir()
Iterates over directory contents
.glob(pattern)
Finds files matching patterns (e.g., "*.txt")
.read_text() / .write_text()
Read/write text files
.read_bytes() / .write_bytes()
Read/write binary files
.unlink()
Deletes a file
.rename(target)
Renames or moves a file
.mkdir(parents=True, exist_ok=True)
Creates directories
.resolve()
Returns the absolute, canonical path
Example
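A short sketch exercising several of the methods listed above inside a temporary directory. The directory and file names are hypothetical.

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())                  # throwaway working directory
loot = base / "loot" / "host-10.0.0.1"           # "/" joins path segments
loot.mkdir(parents=True, exist_ok=True)

notes = loot / "notes.txt"
notes.write_text("ssh open\n", encoding="utf-8")
(loot / "dump.bin").write_bytes(b"\x00\x01")

text_files = sorted(p.name for p in loot.glob("*.txt"))
all_files = sorted(p.name for p in loot.iterdir())
size = notes.stat().st_size
renamed = notes.rename(loot / "notes-final.txt")  # returns the new Path (3.8+)
still_there = notes.exists()
print(text_files, all_files, size, renamed.name, still_there)
```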
This approach keeps path handling clean, avoids string concatenation errors, and works consistently across operating systems.
Directory and Metadata Operations
Python provides several modules for interacting with the filesystem beyond simple file reads and writes. os, shutil, and stat expose functions for listing directories, copying files, modifying permissions, and manipulating metadata (operations frequently used in offensive tooling when collecting evidence, staging payloads, or archiving output).
Common Functions
os.listdir(path)
Lists directory contents
os.makedirs(path, exist_ok=True)
Creates directory trees safely
os.chmod(path, mode)
Changes file permissions
shutil.copy2(src, dst)
Copies a file while preserving metadata (mtime, permissions)
shutil.copytree(src, dst)
Recursively copies an entire directory
shutil.rmtree(path)
Recursively deletes a directory tree
shutil.make_archive(base, format, root_dir)
Creates zip/tar archives
stat.S_IRUSR, stat.S_IXUSR
Permission flags (read, execute for owner)
Example
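A sketch combining several of the functions above in a temporary directory. The output, report.txt, and evidence names are illustrative.

```python
import os
import shutil
import stat
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, "output")
os.makedirs(os.path.join(src_dir, "logs"), exist_ok=True)

report = os.path.join(src_dir, "report.txt")
with open(report, "w", encoding="utf-8") as fh:
    fh.write("findings\n")

os.chmod(report, stat.S_IRUSR | stat.S_IWUSR)           # owner read/write only
backup = shutil.copy2(report, os.path.join(work, "report.bak"))
mirror = shutil.copytree(src_dir, os.path.join(work, "mirror"))

# make_archive appends the extension and returns the archive path.
archive = shutil.make_archive(os.path.join(work, "evidence"), "zip", root_dir=src_dir)
listing = sorted(os.listdir(work))
shutil.rmtree(mirror)                                    # remove the copied tree
print(listing, os.path.basename(archive))
```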
shutil provides higher-level operations that bundle multiple system calls, reducing the risk of inconsistent states. This is especially important when tooling handles logs, loot files, or output directories during offensive engagements.
Temporary Files and Atomic Writes
Temporary files are essential when a script needs to generate output safely without exposing partially written data. Python's tempfile module provides utilities for creating secure, uniquely named temporary files.
An atomic write pattern works by writing data to a temporary file and then replacing the target file in a single filesystem operation. This guarantees that other processes never see an incomplete file (a useful property when storing scan results, staging payloads, or updating logs in environments where multiple tools may read the same files).
The os.replace call performs an atomic swap on most filesystems, ensuring that the final file appears fully formed. This pattern prevents corruption, race conditions, and half-written outputs (issues that can cause inconsistent results during offensive tooling execution).
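The pattern described above can be sketched as follows. The atomic_write helper is hypothetical; it creates the temporary file in the target's directory because the swap is only atomic within a single filesystem.

```python
import os
import tempfile

def atomic_write(path, data: bytes):
    """Write data so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())       # push bytes to disk before the swap
        os.replace(tmp_path, path)       # single rename-style operation
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

target = os.path.join(tempfile.mkdtemp(), "results.json")
atomic_write(target, b'{"status": "old"}')
atomic_write(target, b'{"status": "new"}')
with open(target, "rb") as fh:
    final = fh.read()
leftovers = [n for n in os.listdir(os.path.dirname(target)) if n.startswith(".tmp-")]
print(final, leftovers)
```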
Memory-Mapped Files
Memory-mapped files allow a script to treat file contents as if they were byte arrays in memory. This is especially valuable when dealing with large binaries, disk images, or forensic artifacts, since reads and writes occur without repeatedly copying data between Python and the operating system.
A memory-mapped region exposes the underlying file directly through slicing, enabling fast random access, useful when parsing structured binary formats, scanning raw disk sectors, or extracting payloads from large images.
Because the file is mapped into virtual memory, operations are efficient even for multi-gigabyte data sources. This pattern is well-suited for offensive tooling that needs to inspect or manipulate binary data at arbitrary offsets.
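A minimal mmap sketch on a small synthetic file. The MAGIC marker and file layout are made up; real images would be far larger.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "image.bin")
with open(path, "wb") as fh:
    fh.write(b"\x00" * 4096 + b"MAGIC" + b"\x00" * 4096)   # synthetic "image"

with open(path, "r+b") as fh:
    with mmap.mmap(fh.fileno(), 0) as mm:       # length 0 maps the whole file
        offset = mm.find(b"MAGIC")              # search the mapped bytes
        marker = mm[offset:offset + 5]          # slicing returns bytes
        mm[offset:offset + 5] = b"PATCH"        # same-length in-place write
        mm.flush()

with open(path, "rb") as fh:
    fh.seek(4096)
    patched = fh.read(5)
print(offset, marker, patched)
```

Slice assignment on a mapping must keep the length unchanged, since the mapping cannot grow the file.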
Common Structured Formats
Different file formats are frequently used in offensive tooling: for configuration, storing results, exchanging structured data, or compressing large logs. The following cards summarise the most common ones.
Exception Handling
File operations commonly fail due to missing files, permission issues, or invalid paths. Handling these exceptions explicitly helps tools behave predictably, especially when dealing with user-supplied input or external resources.
It is recommended to catch specific exceptions rather than the broad OSError.
Useful subclasses include: IsADirectoryError, FileExistsError, NotADirectoryError, and others.
Targeted exception handling provides clearer error messages and safer behaviour, particularly in offensive tooling where unexpected paths or permissions are common.
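A sketch of targeted handling, assuming a hypothetical read_wordlist helper that reports errors as short strings instead of raising.

```python
import os
import tempfile

def read_wordlist(path):
    """Return (lines, error); error is None on success."""
    try:
        with open(path, encoding="utf-8") as fh:
            return [line.strip() for line in fh if line.strip()], None
    except FileNotFoundError:
        return [], "missing file"
    except IsADirectoryError:
        return [], "path is a directory"
    except PermissionError:
        return [], "permission denied"
    except OSError as exc:                # fallback for anything else
        return [], f"os error: {exc}"

workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "words.txt")
with open(good, "w", encoding="utf-8") as fh:
    fh.write("admin\n\nroot\n")

words, err = read_wordlist(good)
_, missing_err = read_wordlist(os.path.join(workdir, "nope.txt"))
print(words, err, missing_err)
```

The specific handlers come first because they are subclasses of OSError; placing the broad handler first would shadow them.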
File I/O Best Practices
Performance Guidelines
Read and write in chunks (e.g., read(65536)) for large files.
Use binary mode to avoid per-character encoding overhead when copying raw data.
Prefer Path.read_bytes() and Path.write_bytes() for concise code paths.
Use memory-mapped files (mmap) for random access on multi-gigabyte files.
Avoid many small writes; accumulate into a buffer or use io.BufferedWriter.
Security Guidelines
Path traversal (../../etc/passwd)
Validate or normalize user-supplied paths with Path.resolve(), then ensure the result stays inside the allowed directory.
Untrusted pickle data
Avoid loading untrusted pickle content; use JSON, MessagePack, or custom formats instead.
Race conditions (TOCTOU)
Use appropriate open flags (e.g., "xb" for exclusive creation) or write to a temporary file and then atomically rename.
Encoding pitfalls
Always specify encoding="utf-8" unless there is a specific need for another encoding.
Leaked file handles
Use with blocks consistently to guarantee file handles are closed.
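The path traversal guideline above can be sketched as follows. The safe_join helper is hypothetical; it resolves the candidate path and rejects anything that lands outside the base directory.

```python
import tempfile
from pathlib import Path

def safe_join(base_dir, user_path):
    """Resolve user_path under base_dir, rejecting escapes from it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # relative_to raises ValueError when candidate is outside base.
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes base directory: {user_path!r}") from None
    return candidate

base = tempfile.mkdtemp()
inside = safe_join(base, "loot/creds.txt")

try:
    safe_join(base, "../../etc/passwd")
    escaped = True
except ValueError:
    escaped = False
print(inside.name, escaped)
```

Resolving before the check matters: comparing the raw strings would miss ".." segments and symlinks.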
