Reading Data from Files
Overview
Learn how to read file content using methods like read(), readline(), and readlines(). You'll practice iterating through files line by line so you can process large files efficiently without loading them fully into memory.
What You Will Learn in This Lesson
By the end of this lesson, you will know:
- Reading entire files: Use read() to get all file content at once.
- Reading line by line: Use readline() and readlines() to process files line by line.
- Iterating through files: Loop through files efficiently using a for loop.
- Processing large files: Learn techniques for handling files that are too big to load into memory.
- Common patterns: Master the most common ways to read and process file data.
Three Ways to Read Files
Python provides three main methods for reading data from files. Each has its own use case:
| Method | What It Does | When to Use |
|---|---|---|
| read() | Reads the entire file as one string | Small files, when you need all content at once |
| readline() | Reads one line at a time | When you want to process line by line, or read just one line |
| readlines() | Reads all lines into a list | When you need all lines but want them as a list |
Method 1: read() - Read Everything
The read() method reads the entire file and returns it as a single string.
file = open("data.txt", "r")
content = file.read()
print(content)
file.close()
Example File: data.txt
Line 1
Line 2
Line 3
When you use read(), you get:
"Line 1\nLine 2\nLine 3\n"
Notice the \n characters - these represent newlines. The final \n is there because the last line of the file also ends with a newline.
When to Use read()
Use read() for small files where you need all the content at once. For large files, this can use too much memory.
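As a minimal sketch of the small-file case, the example below reads a whole file with read() and then inspects the resulting string. The with statement (which closes the file automatically) and the temporary file notes.txt are assumptions added so the example runs on its own; they are not part of the lesson's data.txt example.

```python
import os
import tempfile

# Setup (assumption): create a small sample file so the sketch is self-contained.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "notes.txt")
with open(path, "w") as f:
    f.write("Line 1\nLine 2\nLine 3\n")

# read() returns the whole file as one string, newlines included.
with open(path, "r") as f:
    content = f.read()

newline_count = content.count("\n")  # one \n per line in this sample file

print(repr(content))   # repr() makes the \n characters visible
print(len(content))    # number of characters read
print(newline_count)
```

Because the entire string lives in memory at once, this approach suits configuration files, short notes, and similar small inputs.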
Method 2: readline() - Read One Line
The readline() method reads just one line from the file. Each time you call it, it reads the next line.
file = open("data.txt", "r")
line1 = file.readline() # Reads first line
line2 = file.readline() # Reads second line
line3 = file.readline() # Reads third line
print(line1) # "Line 1\n"
print(line2) # "Line 2\n"
print(line3) # "Line 3\n"
file.close()
Note: readline() includes the newline character (\n) at the end of each line. Once the end of the file is reached, it returns an empty string, so you can use that to stop a loop:
file = open("data.txt", "r")
while True:
    line = file.readline()
    if not line: # Empty string means end of file
        break
    print(line.strip()) # strip() removes the \n
file.close()
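One situation where readline() is handy is reading a single header line before handling the rest of the file. The sketch below shows that idea under stated assumptions: the file name scores_with_header.txt and its contents are hypothetical and are created by the setup code so the example runs on its own.

```python
import os
import tempfile

# Setup (assumption): a hypothetical file whose first line is a header.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "scores_with_header.txt")
with open(path, "w") as f:
    f.write("score\n90\n75\n82\n")

file = open(path, "r")
header = file.readline().strip()   # consume only the first line
scores = []
while True:
    line = file.readline()         # continues from where the last call stopped
    if not line:                   # empty string: end of file
        break
    scores.append(int(line.strip()))
file.close()

print(header)
print(scores)
```

Note that a blank line in the middle of a file is read as "\n", not as an empty string, so the end-of-file check does not stop early on blank lines.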
Method 3: readlines() - Read All Lines
The readlines() method reads all lines and returns them as a list of strings.
file = open("data.txt", "r")
lines = file.readlines()
file.close()
print(lines)
# Output: ['Line 1\n', 'Line 2\n', 'Line 3\n']
# Process each line
for line in lines:
    print(line.strip())
When to Use readlines()
Use readlines() when you need all lines as a list, which is useful for processing or manipulating the lines. Like read(), this loads everything into memory, so use it for smaller files.
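To illustrate why a list can be convenient, here is a small sketch that counts the lines, picks the last one by index, and reverses their order. The setup code that writes the sample file is an assumption added to keep the example self-contained.

```python
import os
import tempfile

# Setup (assumption): recreate the lesson's three-line sample file.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "data.txt")
with open(path, "w") as f:
    f.write("Line 1\nLine 2\nLine 3\n")

file = open(path, "r")
lines = file.readlines()
file.close()

line_count = len(lines)                 # how many lines the file has
last_line = lines[-1].strip()           # lists support indexing from the end
reversed_lines = [line.strip() for line in reversed(lines)]

print(line_count)
print(last_line)
print(reversed_lines)
```

Operations like indexing and reversing need every line to be available at once, which is the trade-off for holding the whole file in memory.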
Best Method: Iterating with a For Loop
The most Pythonic and memory-efficient way to read a file is to iterate over it directly with a for loop. This reads one line at a time without loading everything into memory.
file = open("data.txt", "r")
for line in file:
    print(line.strip()) # Process each line
file.close()
Why This is Best
- Memory efficient: Only one line is in memory at a time
- Simple: Clean, readable code
- Fast: Works well even for very large files
- Pythonic: This is the recommended Python way
file = open("scores.txt", "r")
total = 0
count = 0
for line in file:
    score = int(line.strip()) # Convert to integer
    total += score
    count += 1
file.close()
if count > 0: # Avoid dividing by zero when the file is empty
    average = total / count
    print(f"Average score: {average}")
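The same pattern extends to files that are too large to load at once: keep only running values and never store the lines themselves. The sketch below is one possible way to do that. The function name summarize_scores, the blank-line skipping, and the use of with to close the file are assumptions for illustration, not part of the lesson.

```python
import os
import tempfile


def summarize_scores(path):
    """Return (count, total, highest) without storing the file's lines."""
    count = 0
    total = 0
    highest = None
    with open(path, "r") as file:       # with closes the file automatically
        for line in file:
            text = line.strip()
            if not text:                # skip blank lines
                continue
            score = int(text)
            count += 1
            total += score
            if highest is None or score > highest:
                highest = score
    return count, total, highest


# Setup (assumption): write a small sample file so the sketch runs on its own.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "scores.txt")
with open(path, "w") as f:
    f.write("90\n75\n\n82\n")

count, total, highest = summarize_scores(path)
print(count, total, highest)
```

Memory use stays roughly constant no matter how many lines the file has, because each line is discarded once it has been added to the running values.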
Removing Newlines
When you read lines from a file, each line ends with a newline character (\n). Often, you'll want to remove this.
Using strip() to Remove Newlines
file = open("data.txt", "r")
for line in file:
    cleaned = line.strip() # Removes \n and surrounding whitespace
    print(cleaned)
file.close()
strip() Method
The strip() method removes whitespace (including newlines) from both ends of a string:
line = " Hello World \n"
cleaned = line.strip() # "Hello World"
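Sometimes you want to drop the newline but keep other whitespace, such as leading indentation. The short comparison below uses rstrip(), a related string method that the lesson does not cover, so treat it as an optional extra rather than part of the material above.

```python
line = " Hello World \n"

both_ends = line.strip()           # removes whitespace and \n from both ends
right_only = line.rstrip()         # removes trailing whitespace, keeps the leading space
newline_only = line.rstrip("\n")   # removes only trailing \n characters

print(repr(both_ends))
print(repr(right_only))
print(repr(newline_only))
```

Printing with repr() shows the quotes around each result, which makes any remaining spaces easy to see.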
Reading Specific Amounts
You can also read a specific number of characters from a file by passing a number to read().
file = open("data.txt", "r")
first_10 = file.read(10) # Read first 10 characters
print(first_10)
file.close()
Use case: This is useful when you only need part of a file. In text mode the number counts characters; to read a specific number of bytes, open the file in binary mode ("rb").
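Calling read() with a size repeatedly is also one way to work through a file in fixed-size pieces. The sketch below assumes a tiny chunk size of 8 characters purely so the result is easy to follow; CHUNK_SIZE and the sample file are hypothetical, and the chunks are collected in a list only to make them visible.

```python
import os
import tempfile

# Setup (assumption): a small sample file standing in for a large one.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "data.txt")
with open(path, "w") as f:
    f.write("Line 1\nLine 2\nLine 3\n")

CHUNK_SIZE = 8   # hypothetical value; real programs typically use far larger chunks

chunks = []      # kept only for display; a real program would process and discard
total_chars = 0
file = open(path, "r")
while True:
    chunk = file.read(CHUNK_SIZE)   # returns at most CHUNK_SIZE characters
    if not chunk:                   # empty string: end of file
        break
    chunks.append(chunk)
    total_chars += len(chunk)
file.close()

print(chunks)
print(total_chars)
```

Chunks do not respect line boundaries, so this approach fits data where lines do not matter; for line-oriented text, iterating with a for loop is usually simpler.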
Summary
In this lesson, you learned:
- read(): Reads entire file as one string - use for small files
- readline(): Reads one line at a time - use when you need line-by-line control
- readlines(): Reads all lines as a list - use when you need a list of lines
- For loop iteration: Best method - memory efficient and Pythonic
- strip(): Removes newlines and whitespace from strings
Remember
For most cases, use for line in file: - it's the most efficient and Pythonic way to read files!
End-of-Lesson Exercises
Think about these questions to reinforce what you've learned:
Exercise 1: Which Method?
When would you use read() vs readlines() vs a for loop? Give an example for each.
Exercise 2: Processing Lines
Write pseudocode for reading a file and counting how many lines contain the word "Python".