CSV Files
CSV (Comma-Separated Values) is one of the most common data formats - every
spreadsheet application can export to it, databases can import it, and it's human-readable.
Python's built-in csv module handles all the tricky parts: quoting, escaping,
different delimiters, and headers. Whether you're processing financial data, user records, or
scientific measurements, mastering CSV is essential!
Reading CSV Files
The csv.reader() returns an iterator that yields each row as a list of strings.
For more convenient access by column names, csv.DictReader() yields each row as
a dictionary with column headers as keys. This makes code more readable and maintainable.
Click Run to execute your code
csv.reader(f) - Rows as lists: row[0], row[1]csv.DictReader(f) - Rows as dicts: row['name'], row['age']Use reader for simple, positional access.
Use DictReader when columns have meaning - code is clearer!
Writing CSV Files
Use csv.writer() to write lists of values, and csv.DictWriter()
to write dictionaries. Always open files with newline="" on Windows to prevent
blank lines between rows. The writeheader() method writes column names.
Click Run to execute your code
newline="" when opening CSV files
for writing: open("file.csv", "w", newline=""). Without this, you'll get
extra blank lines between rows. This is because the csv module handles line endings itself.
Dialects and Options
Not all CSV files use commas! European files often use semicolons, tab-separated values (TSV) are common, and different applications quote fields differently. The csv module supports various dialects and lets you customize delimiter, quoting, and other options.
Click Run to execute your code
csv.QUOTE_MINIMAL - Only quote when needed (default)csv.QUOTE_ALL - Quote all fieldscsv.QUOTE_NONNUMERIC - Quote non-numeric fieldscsv.QUOTE_NONE - Never quote (escape special chars)Use QUOTE_ALL for maximum compatibility with other tools.
Practical Examples
In real projects, you'll process CSV data: calculate totals, filter rows, group by categories, add computed columns, and transform data. The csv module combines well with Python's built-in functions and collections for powerful data processing.
Click Run to execute your code
Common Mistakes
1. Forgetting newline="" on Windows
# Wrong - creates extra blank lines on Windows!
with open("data.csv", "w") as f:
writer = csv.writer(f)
writer.writerows(data)
# Correct - always use newline=""
with open("data.csv", "w", newline="") as f:
writer = csv.writer(f)
writer.writerows(data)
2. Not handling commas in data
# Wrong - manual string building breaks on commas!
data = ["Alice", "New York, NY", "30"]
line = ",".join(data) # "Alice,New York, NY,30" - 4 fields!
# Correct - let csv module handle quoting
with open("data.csv", "w", newline="") as f:
writer = csv.writer(f)
writer.writerow(data) # "Alice","New York, NY","30"
3. Assuming all fields are strings
# Wrong - CSV values are always strings!
with open("data.csv", "r") as f:
reader = csv.DictReader(f)
for row in reader:
total = row['quantity'] * row['price'] # TypeError!
# Correct - convert to numbers
with open("data.csv", "r") as f:
reader = csv.DictReader(f)
for row in reader:
total = int(row['quantity']) * float(row['price'])
4. Reading the same file twice
# Wrong - reader is exhausted after first loop!
with open("data.csv", "r") as f:
reader = csv.reader(f)
for row in reader:
print(row)
for row in reader: # Empty! Iterator exhausted
process(row)
# Correct - read into list or open again
with open("data.csv", "r") as f:
rows = list(csv.reader(f)) # Store in memory
for row in rows:
print(row)
for row in rows: # Works!
process(row)
5. DictWriter with missing keys
# Wrong - KeyError if dict missing a field!
fieldnames = ["name", "age", "city"]
data = {"name": "Alice", "age": 30} # Missing 'city'!
with open("data.csv", "w", newline="") as f:
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writerow(data) # KeyError: 'city'
# Correct - use extrasaction and restval
with open("data.csv", "w", newline="") as f:
writer = csv.DictWriter(
f, fieldnames=fieldnames,
extrasaction='ignore', # Ignore extra keys
restval="" # Default for missing keys
)
writer.writerow(data) # Works!
Exercise: Sales Report Generator
Task: Process sales data and generate a summary report CSV.
Requirements:
- Read a sales CSV with columns: product, quantity, price
- Calculate total revenue per product
- Write a summary CSV with: product, total_quantity, total_revenue
- Sort by revenue (highest first)
Click Run to execute your code
Show Solution
import csv
import os
from collections import defaultdict
# Create sample sales data
sales_csv = """product,quantity,price
Widget,10,9.99
Gadget,5,19.99
Widget,8,9.99
Gadget,12,19.99
Thing,20,5.00
Widget,15,9.99
Thing,25,5.00"""
with open("sales.csv", "w") as f:
f.write(sales_csv)
# Process sales data
totals = defaultdict(lambda: {"quantity": 0, "revenue": 0.0})
with open("sales.csv", "r") as f:
reader = csv.DictReader(f)
for row in reader:
product = row["product"]
qty = int(row["quantity"])
price = float(row["price"])
totals[product]["quantity"] += qty
totals[product]["revenue"] += qty * price
# Convert to sorted list (by revenue, descending)
summary = [
{"product": p, "total_quantity": d["quantity"], "total_revenue": f"{d['revenue']:.2f}"}
for p, d in sorted(totals.items(), key=lambda x: -x[1]["revenue"])
]
# Write summary report
with open("summary.csv", "w", newline="") as f:
fieldnames = ["product", "total_quantity", "total_revenue"]
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(summary)
# Display result
print("=== Sales Summary ===")
with open("summary.csv", "r") as f:
print(f.read())
# Cleanup
os.remove("sales.csv")
os.remove("summary.csv")
Summary
- Read with reader:
csv.reader(f)- rows as lists - Read with DictReader:
csv.DictReader(f)- rows as dicts - Write with writer:
csv.writer(f)withwriterow() - Write with DictWriter:
csv.DictWriter(f, fieldnames=[]) - Write header:
writer.writeheader() - Batch write:
writer.writerows(data) - Custom delimiter:
delimiter=";"ordelimiter="\t" - Windows fix: Always use
newline=""when writing - Quoting:
quoting=csv.QUOTE_ALLfor safety - Convert types: CSV values are strings - convert with int/float
What's Next?
CSV is great for tabular data, but for more complex structures you'll need JSON. JSON supports nested objects, arrays, and multiple data types - perfect for configuration files, API responses, and hierarchical data. Let's learn how to read and write JSON with Python!
Enjoying these tutorials?