java.nio.file is the production baseline for file workflows in Java.
Compared to legacy java.io.File, it gives better path safety, clearer APIs, and stronger behavior around moves, links, and attributes.
Why NIO.2 Matters in Real Systems
Backend services often depend on file I/O for imports, exports, checkpoints, and recovery markers. Most incidents are not about reading bytes, they are about edge cases:
- partial writes after process crash
- path traversal through user input
- cross-filesystem move surprises
- silent overwrites and missing recovery metadata
NIO.2 gives enough primitives to handle these explicitly.
Core Building Blocks
Path: normalized, composable path abstractionFiles: high-level operations (read/write/move/walk/attributes)StandardOpenOption: explicit write/append/create semanticsStandardCopyOption: replace/atomic move behaviorFileVisitor: controlled recursive traversal
Example 1: Safe Path Resolution (Traversal Defense)
When a path fragment comes from request payload or job metadata, never directly join and trust it.
Path root = Path.of("/srv/imports").toAbsolutePath().normalize();
String userRelative = "tenantA/../tenantA/orders.csv"; // untrusted
Path resolved = root.resolve(userRelative).normalize();
if (!resolved.startsWith(root)) {
throw new SecurityException("Path traversal attempt: " + userRelative);
}
This small check prevents writing or reading outside the allowed directory tree.
Example 2: Crash-Safe File Replace (Temp + Atomic Move)
Use this for config/state files that must never be half-written.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
public final class AtomicFileWriter {
public static void writeAtomically(Path target, String content) throws IOException {
Path parent = target.toAbsolutePath().getParent();
Files.createDirectories(parent);
Path temp = Files.createTempFile(parent, target.getFileName().toString(), ".tmp");
try {
Files.writeString(
temp,
content,
StandardCharsets.UTF_8,
StandardOpenOption.TRUNCATE_EXISTING
);
Files.move(
temp,
target,
StandardCopyOption.ATOMIC_MOVE,
StandardCopyOption.REPLACE_EXISTING
);
} finally {
Files.deleteIfExists(temp);
}
}
}
Notes:
ATOMIC_MOVEis strongest when source and target are on same filesystem.- keep temp file in same directory as target to maximize atomic move support.
Example 3: Large Directory Scan with Bounded Depth
Path root = Path.of("/data/imports");
try (var stream = Files.find(root, 3,
(p, attr) -> attr.isRegularFile() && p.toString().endsWith(".json"))) {
stream.forEach(this::processJsonFile);
}
Why Files.find over manual recursion:
- depth limit is explicit
- access to attributes without extra calls
- avoids accidental unbounded scans
Dry Run: Import Pipeline with Recovery
Assume /data/inbox/orders-2026-05-08.csv arrives.
- service reads from inbox as stream (does not load full file).
- transformed output is written to
/data/stage/orders-2026-05-08.csv.tmp. - checksum/row-count validation is performed.
- output is moved atomically to
/data/processed/orders-2026-05-08.csv. - checkpoint file
/data/checkpoints/orders-2026-05-08.doneis created.
Crash cases:
- crash before step 4: only
.tmpexists, job is safe to retry. - crash after step 4 but before step 5: final file exists without checkpoint; restart logic can detect and reconcile.
This is the difference between “works locally” and “operationally recoverable”.
Common Mistakes
- using
Path.of(base + "/" + userInput)string concatenation everywhere - reading large files with
readAllBytes/readStringby default - moving files across mount points and assuming atomicity
- ignoring symbolic links in cleanup logic
Production Checklist
- normalize and boundary-check any externally influenced path.
- choose explicit open options (
CREATE_NEW,TRUNCATE_EXISTING,APPEND) instead of defaults. - prefer stream/chunk processing for large files.
- use temp + atomic move for durable updates.
- include restart semantics (checkpoint marker + idempotent processing).
Key Takeaways
- NIO.2 is not just a nicer API; it enables safer failure handling patterns.
- treat file operations like distributed systems boundaries: explicit contracts, retries, and recovery.
- correctness comes from path validation + atomic replace + predictable restart behavior.
Categories
Tags