Unit 7 • Lesson 3

Reading Data from Files

Overview

Learn how to read file content with read(), readline(), and readlines(). You'll practice iterating through files line by line so you can process large files without loading them entirely into memory.

Beginner 20–25 min

What You Will Learn in This Lesson

By the end of this lesson, you will know:

  • Reading entire files: Use read() to get all file content at once.
  • Reading line by line: Use readline() and readlines() to process files line by line.
  • Iterating through files: Loop through files efficiently using a for loop.
  • Processing large files: Learn techniques for handling files that are too big to load into memory.
  • Common patterns: Master the most common ways to read and process file data.

Three Ways to Read Files

Python provides three main methods for reading data from files. Each has its own use case:

Method      | What It Does                        | When to Use
------------+-------------------------------------+-----------------------------------------------------------
read()      | Reads the entire file as one string | Small files, when you need all content at once
readline()  | Reads one line at a time            | When you want to process line by line, or read just one line
readlines() | Reads all lines into a list         | When you need all lines but want them as a list
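
To make the comparison concrete, here is a minimal sketch that calls all three methods on the same file and shows what each one returns. The temporary sample file and the variable names are assumptions made for this demo only; they are not part of the lesson's data.txt.

```python
import os
import tempfile

# Create a small sample file so the example runs on its own.
# (The temporary file and its contents are assumptions for this demo.)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Line 1\nLine 2\nLine 3\n")
    path = tmp.name

file = open(path, "r")
whole = file.read()          # one string with the entire content
file.close()

file = open(path, "r")
first = file.readline()      # one string holding only the first line
file.close()

file = open(path, "r")
as_list = file.readlines()   # a list of strings, one per line
file.close()

os.remove(path)  # clean up the sample file

print(repr(whole))
print(repr(first))
print(as_list)
```

The file is reopened before each call because every read moves the file position forward; calling readlines() right after read() on the same open file would return an empty list.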

Method 1: read() - Read Everything

The read() method reads the entire file and returns it as a single string.

Reading Entire File
file = open("data.txt", "r")
content = file.read()
print(content)
file.close()

Example File: data.txt

Line 1
Line 2
Line 3

When you use read() on this file, you get:

"Line 1\nLine 2\nLine 3\n"

Notice the \n characters - these represent newlines.

When to Use read()

Use read() for small files where you need all the content at once. For large files, this can use too much memory.

Method 2: readline() - Read One Line

The readline() method reads just one line from the file. Each time you call it, it reads the next line.

Reading One Line at a Time
file = open("data.txt", "r")

line1 = file.readline()  # Reads first line
line2 = file.readline()  # Reads second line
line3 = file.readline()  # Reads third line

print(line1)  # line1 is "Line 1\n"; print() adds its own newline, so a blank line follows
print(line2)  # line2 is "Line 2\n"
print(line3)  # line3 is "Line 3\n"

file.close()

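Note: readline() includes the newline character (\n) at the end of each line. Once the end of the file is reached, it returns an empty string (""). A blank line inside the file is returned as "\n", not "", so the two cases can be told apart.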
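
The difference between a blank line and the end of the file can be seen in a short sketch. It assumes a hypothetical three-line sample file whose middle line is blank, and calls readline() more times than there are lines.

```python
import os
import tempfile

# Sample file with a blank middle line (an assumption for this demo).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\n\nlast\n")
    path = tmp.name

file = open(path, "r")
results = []
for _ in range(5):                 # 5 calls, but the file has only 3 lines
    results.append(file.readline())
file.close()
os.remove(path)

print(results)  # ['first\n', '\n', 'last\n', '', '']
```

The blank line comes back as "\n", which is a non-empty string, while every call after the last line returns "". This is why a check such as `if not line` is a safe test for end of file.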

Reading Until End of File
file = open("data.txt", "r")

while True:
    line = file.readline()
    if not line:  # Empty string means end of file
        break
    print(line.strip())  # strip() removes the \n

file.close()

Method 3: readlines() - Read All Lines

The readlines() method reads all lines and returns them as a list of strings.

Reading All Lines as a List
file = open("data.txt", "r")
lines = file.readlines()
file.close()

print(lines)
# Output: ['Line 1\n', 'Line 2\n', 'Line 3\n']

# Process each line
for line in lines:
    print(line.strip())

When to Use readlines()

Use readlines() when you need all lines as a list, which is useful for processing or manipulating the lines. Like read(), this loads everything into memory, so use it for smaller files.
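
As an illustration of why a list can be handy, the sketch below reads a small file with readlines() and then uses ordinary list operations on the result. The sample file and the fruit names in it are made up for this example.

```python
import os
import tempfile

# Sample file created only for this demo (its contents are an assumption).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("cherry\napple\nbanana\n")
    path = tmp.name

file = open(path, "r")
lines = file.readlines()
file.close()
os.remove(path)

# Strip the trailing newline from every line
names = []
for line in lines:
    names.append(line.strip())

line_count = len(names)    # how many lines the file had
last_name = names[-1]      # index from the end of the list
ordered = sorted(names)    # sorted copy of the list

print(line_count)  # 3
print(last_name)   # banana
print(ordered)     # ['apple', 'banana', 'cherry']
```

Counting, indexing from the end, and sorting all need the full set of lines at once, which is the situation where readlines() fits better than reading one line at a time.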

Best Method: Iterating with a For Loop

The most Pythonic and memory-efficient way to read a file is to iterate over it directly with a for loop. This reads one line at a time without loading everything into memory.

Recommended: Iterate Over File
file = open("data.txt", "r")

for line in file:
    print(line.strip())  # Process each line

file.close()

Why This is Best

  • Memory efficient: Only the current line (plus a small read buffer) is held in memory at a time
  • Simple: Clean, readable code
  • Fast: Works well even for very large files
  • Pythonic: This is the recommended Python way
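
The same loop is often written inside a with statement, which closes the file automatically when the block ends, even if an error occurs partway through. The sketch below is one possible version, not the only correct one; the sample file and the idea of tracking the longest line are assumptions chosen to show that nothing but the current line and a running result needs to be kept.

```python
import os
import tempfile

# Sample file created only so the sketch runs on its own (an assumption).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("short\na much longer line\nmid length\n")
    path = tmp.name

line_count = 0
longest = ""

# "with" closes the file automatically when the indented block ends.
with open(path, "r") as file:
    for line in file:
        text = line.strip()
        line_count += 1
        if len(text) > len(longest):
            longest = text  # keep only the longest line seen so far

os.remove(path)

print(line_count)  # 3
print(longest)     # a much longer line
```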

Processing Lines
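file = open("scores.txt", "r")

total = 0
count = 0

for line in file:
    text = line.strip()
    if not text:  # Skip blank lines, which int() cannot convert
        continue
    total += int(text)  # Convert to integer and add to the running total
    count += 1

file.close()

if count > 0:  # Avoid dividing by zero when the file has no scores
    average = total / count
    print(f"Average score: {average}")
else:
    print("No scores found")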

Practice: Reading Files

Try It Yourself

Try reading a file using different methods. Notice how each method works:

What happened? You simulated reading a file line by line. In a real program, you would use for line in file: to iterate through the file.

Removing Newlines

When you read lines from a file, each line ends with a newline character (\n). Often, you'll want to remove this.

Using strip() to Remove Newlines
file = open("data.txt", "r")

for line in file:
    cleaned = line.strip()  # Removes the \n plus any other leading/trailing whitespace
    print(cleaned)

file.close()

strip() Method

The strip() method removes whitespace (spaces, tabs, and newlines) from both ends of a string, leaving any whitespace in the middle untouched:

line = "  Hello World  \n"
cleaned = line.strip()  # "Hello World"
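
Because strip() also removes leading spaces, it can discard indentation you wanted to keep. The short comparison below shows strip() next to the built-in rstrip(), which only trims the right-hand end; the variable names are just for illustration.

```python
line = "  Hello World  \n"

both_ends = line.strip()           # removes whitespace from both ends
right_only = line.rstrip()         # removes whitespace from the right end only
newline_only = line.rstrip("\n")   # removes only trailing newline characters

print(repr(both_ends))     # 'Hello World'
print(repr(right_only))    # '  Hello World'
print(repr(newline_only))  # '  Hello World  '
```

Using repr() here makes the remaining spaces visible in the printed output.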

Reading Specific Amounts

You can also read a specific number of characters from a file by passing a number to read().

Reading N Characters
file = open("data.txt", "r")

first_10 = file.read(10)  # Read first 10 characters
print(first_10)

file.close()

Use case: This is useful when you only need part of a file. In text mode ("r"), the number counts characters; if the file is opened in binary mode ("rb"), it counts bytes, which helps when you need specific byte counts.
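
Passing a size to read() also gives a way to work through a file that has no convenient line breaks, one fixed-size piece at a time. The following is a minimal sketch of that pattern; the chunk size of 16 and the 50-character sample file are arbitrary assumptions for the demo.

```python
import os
import tempfile

# Sample file of 50 characters with no newlines (an assumption for this demo).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("abcdefghij" * 5)
    path = tmp.name

CHUNK_SIZE = 16  # arbitrary size chosen for illustration
chunk_lengths = []
total_chars = 0

file = open(path, "r")
while True:
    chunk = file.read(CHUNK_SIZE)  # returns at most CHUNK_SIZE characters
    if not chunk:                  # empty string means end of file
        break
    chunk_lengths.append(len(chunk))
    total_chars += len(chunk)
file.close()
os.remove(path)

print(chunk_lengths)  # [16, 16, 16, 2]
print(total_chars)    # 50
```

Only one chunk is held in memory at a time, so the same loop works for files much larger than the available memory. The last chunk is usually shorter than the rest, so code that processes chunks should not assume a fixed length.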

Summary

In this lesson, you learned:

  • read(): Reads entire file as one string - use for small files
  • readline(): Reads one line at a time - use when you need line-by-line control
  • readlines(): Reads all lines as a list - use when you need a list of lines
  • For loop iteration: Best method - memory efficient and Pythonic
  • strip(): Removes newlines and whitespace from strings

Remember

For most cases, use for line in file: - it's the most efficient and Pythonic way to read files!

End-of-Lesson Exercises

Think about these questions to reinforce what you've learned:

Exercise 1: Which Method?

When would you use read() vs readlines() vs a for loop? Give an example for each.

Exercise 2: Processing Lines

Write pseudocode for reading a file and counting how many lines contain the word "Python".