Pandas Rename Column: 6 Methods to Rename DataFrame Columns in Python
Column names in real-world datasets are rarely clean. CSV exports arrive with spaces, mixed casing, and cryptic abbreviations. Database queries produce columns like col_0 or Unnamed: 0. API responses nest fields under names that break downstream code. Before you can merge, group, or model your data, you need column names that are consistent, readable, and code-friendly.
Pandas provides multiple ways to rename DataFrame columns, each suited to different situations. Choosing the wrong method leads to subtle bugs -- renaming in place when you expected a copy, or triggering a ValueError because a list length didn't match. This guide covers every approach with working code, a comparison table, and real-world patterns you can copy directly.
Quick Answer
If you just need to rename one or a few columns, use rename() with a dictionary:
import pandas as pd
df = pd.DataFrame({'old_name': [1, 2, 3], 'another_col': [4, 5, 6]})
# Rename a single column
df = df.rename(columns={'old_name': 'new_name'})
# Rename multiple columns
df = df.rename(columns={'new_name': 'id', 'another_col': 'value'})
For bulk transformations (lowercase, strip whitespace, add prefixes), use a list comprehension:
df.columns = [col.lower().strip().replace(' ', '_') for col in df.columns]
Read on for all six methods, when to pick each one, and the common mistakes to avoid.
Method Comparison Table
| Method | Best For | Returns New DF | Handles Subset | Risk Level |
|---|---|---|---|---|
| rename(columns={...}) | Renaming specific columns by name | Yes (default) | Yes | Low |
| rename(columns=func) | Applying a function to all column names | Yes (default) | No | Low |
| df.columns = [...] | Replacing all column names at once | No (in place) | No | Medium -- length must match |
| set_axis(labels, axis=1) | Functional-style full replacement | Yes (default) | No | Medium -- length must match |
| List comprehension | Bulk string transformations | No (in place) | No | Low |
| read_csv(names=...) | Setting column names during file import | N/A | N/A | Low |
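The "Returns New DF" column is easiest to see side by side. The following is a minimal sketch on a made-up two-column DataFrame (the variable names are illustrative only): rename() and set_axis() leave the original untouched, while assigning to df.columns changes it.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# rename() and set_axis() hand back a new DataFrame; df keeps its labels.
renamed = df.rename(columns={'a': 'x'})
relabeled = df.set_axis(['p', 'q'], axis=1)
print(df.columns.tolist())         # ['a', 'b']
print(renamed.columns.tolist())    # ['x', 'b']
print(relabeled.columns.tolist())  # ['p', 'q']

# Assigning to df.columns changes df itself.
df.columns = ['m', 'n']
print(df.columns.tolist())         # ['m', 'n']
```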
Method 1: rename() with a Dictionary
The rename() method is the standard approach. It accepts a dictionary mapping old names to new names. Only the columns you specify are renamed; everything else stays the same.
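Because each existing label is looked up in the dictionary independently, two columns can trade names in a single call. A small sketch, assuming a toy DataFrame with columns a, b, and c:

```python
import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})

# Each label is mapped on its own, so a swap needs no temporary name.
swapped = df.rename(columns={'a': 'b', 'b': 'a'})
print(swapped.columns.tolist())  # ['b', 'a', 'c']
print(swapped['a'].tolist())     # [2]  (the data that used to be under 'b')

# The original DataFrame still has its old labels.
print(df.columns.tolist())       # ['a', 'b', 'c']
```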
Rename a Single Column
import pandas as pd
df = pd.DataFrame({
    'customer_name': ['Alice', 'Bob', 'Charlie'],
    'purchase_amt': [120.50, 89.99, 245.00],
    'purchase_date': ['2026-01-15', '2026-01-16', '2026-01-17']
})
df = df.rename(columns={'purchase_amt': 'amount'})
print(df.columns.tolist())
# ['customer_name', 'amount', 'purchase_date']
Rename Multiple Columns
Pass multiple key-value pairs in the dictionary:
df = df.rename(columns={
    'customer_name': 'name',
    'amount': 'total_usd',
    'purchase_date': 'date'
})
print(df.columns.tolist())
# ['name', 'total_usd', 'date']
The inplace Parameter
By default, rename() returns a new DataFrame. You can modify in place with inplace=True, but this is generally discouraged -- it rarely offers a performance benefit, and it makes code harder to reason about.
# Not recommended, but works
df.rename(columns={'name': 'full_name'}, inplace=True)
# Preferred: assign the result
df = df.rename(columns={'name': 'full_name'})
Method 2: rename() with a Function
When you need to transform every column name using the same rule, pass a callable instead of a dictionary. This is useful for standardization tasks.
Convert to Lowercase
import pandas as pd
df = pd.DataFrame({'First Name': [1], 'Last Name': [2], 'Email Address': [3]})
df = df.rename(columns=str.lower)
print(df.columns.tolist())
# ['first name', 'last name', 'email address']
Replace Spaces with Underscores
df = df.rename(columns=lambda x: x.replace(' ', '_'))
print(df.columns.tolist())
# ['first_name', 'last_name', 'email_address']
Chain Multiple Transformations
df = df.rename(columns=lambda x: x.strip().lower().replace(' ', '_'))
Use str.removeprefix() or str.removesuffix() (Python 3.9+)
import pandas as pd
df = pd.DataFrame({'col_revenue': [100], 'col_cost': [50], 'col_profit': [50]})
df = df.rename(columns=lambda x: x.removeprefix('col_'))
print(df.columns.tolist())
# ['revenue', 'cost', 'profit']
Method 3: Assigning df.columns Directly
When you need to replace all column names at once and you know the full list, assign a new list or Index directly to df.columns. This modifies the DataFrame in place.
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
df.columns = ['x', 'y', 'z']
print(df)
Output:
   x  y  z
0  1  3  5
1  2  4  6
The list length must exactly match the number of columns. If it doesn't, pandas raises a ValueError:
# This raises ValueError: Length mismatch
df.columns = ['x', 'y']
Method 4: set_axis()
The set_axis() method works like direct assignment but returns a new DataFrame by default. This makes it suitable for method chaining.
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
df_renamed = df.set_axis(['x', 'y', 'z'], axis=1)
print(df_renamed.columns.tolist())
# ['x', 'y', 'z']
Use set_axis() when you want a functional style without modifying the original:
result = (
    df
    .set_axis(['x', 'y', 'z'], axis=1)
    .assign(total=lambda d: d['x'] + d['y'] + d['z'])
)
Like direct assignment, the label list must match the column count exactly.
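To illustrate the length requirement, here is a minimal sketch that passes two labels to a three-column DataFrame and catches the resulting error. The variable name error_message is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

try:
    df.set_axis(['x', 'y'], axis=1)  # two labels for three columns
except ValueError as exc:
    error_message = str(exc)
    print(f"set_axis failed: {error_message}")

# The failed call leaves the original labels untouched.
print(df.columns.tolist())  # ['a', 'b', 'c']
```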
Method 5: List Comprehension for Bulk Transformations
List comprehensions give you full Python string processing power. They are the go-to choice for bulk cleanup tasks.
Lowercase and Replace Spaces
import pandas as pd
df = pd.DataFrame({'First Name': [1], 'Last  Name': [2], ' Email ': [3]})
df.columns = [col.strip().lower().replace(' ', '_') for col in df.columns]
print(df.columns.tolist())
# ['first_name', 'last__name', 'email']
Notice the double underscore in last__name: each of the two spaces in 'Last  Name' became its own underscore. Handle that with a regex:
import re
df.columns = [re.sub(r'\s+', '_', col.strip().lower()) for col in df.columns]
print(df.columns.tolist())
# ['first_name', 'last_name', 'email']
Add a Prefix or Suffix
df.columns = [f"user_{col}" for col in df.columns]
# ['user_first_name', 'user_last_name', 'user_email']
Remove a Common Prefix
df.columns = [col.removeprefix('user_') for col in df.columns]
Conditional Renaming
df.columns = [
    col.upper() if col.startswith('id') else col
    for col in df.columns
]
Method 6: Renaming Columns During File Import
When reading a CSV file, you can set column names at import time. This is useful when the file has no header row or when the existing headers are unusable.
import pandas as pd
# File has no header row
df = pd.read_csv('data.csv', header=None, names=['id', 'name', 'value'])
# File has a header row you want to replace
df = pd.read_csv('data.csv', header=0, names=['id', 'name', 'value'])
The names parameter sets column names regardless of what is in the file. Combined with header=0, it reads the file, discards the original header, and applies your names.
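The effect of header=0 is easier to see on an in-memory file. This sketch uses io.StringIO with invented CSV content (the column names and values are placeholders) and compares the two calls:

```python
import io
import pandas as pd

csv_text = "Cust ID,Full Name,Amt\n1,Alice,120.5\n2,Bob,89.99\n"

# header=0 consumes the first line as the header; names= then overrides it.
replaced = pd.read_csv(io.StringIO(csv_text), header=0, names=['id', 'name', 'value'])
print(replaced.columns.tolist())  # ['id', 'name', 'value']
print(len(replaced))              # 2

# Without header=0, passing names= makes pandas treat the first line as data.
kept = pd.read_csv(io.StringIO(csv_text), names=['id', 'name', 'value'])
print(len(kept))                  # 3
print(kept.iloc[0].tolist())      # ['Cust ID', 'Full Name', 'Amt']
```

If the original header text shows up as the first data row, a missing header=0 is the likely cause.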
For Excel files, the same approach works with read_excel():
df = pd.read_excel('report.xlsx', header=0, names=['date', 'region', 'revenue'])
Real-World Example: Cleaning Messy CSV Headers
Datasets from external sources often have headers with inconsistent formatting. Here is a pattern that handles the most common problems in one pass.
import pandas as pd
import re
df = pd.read_csv('messy_data.csv')
print(df.columns.tolist())
# [' Customer ID ', 'First Name', 'last-name', 'Email Address', 'PHONE_NUMBER']
def clean_column_name(name):
    """Standardize a column name to snake_case."""
    name = name.strip()                      # Remove leading/trailing whitespace
    name = name.lower()                      # Convert to lowercase
    name = re.sub(r'[\s\-]+', '_', name)     # Replace spaces and hyphens with underscores
    name = re.sub(r'[^a-z0-9_]', '', name)   # Remove special characters
    name = re.sub(r'_+', '_', name)          # Collapse multiple underscores
    return name
df.columns = [clean_column_name(col) for col in df.columns]
print(df.columns.tolist())
# ['customer_id', 'first_name', 'last_name', 'email_address', 'phone_number']
This pattern works for virtually any messy header format. Save the clean_column_name function in a utility module and reuse it across projects.
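One caveat when reusing such a function: two different raw headers can collapse to the same cleaned name. The sketch below repeats the cleaning steps and wraps them in a hypothetical helper, clean_unique, which is not part of pandas and simply raises if the cleaned names collide.

```python
import re

def clean_column_name(name):
    """Standardize a column name to snake_case (same steps as above)."""
    name = name.strip().lower()
    name = re.sub(r'[\s\-]+', '_', name)
    name = re.sub(r'[^a-z0-9_]', '', name)
    return re.sub(r'_+', '_', name)

def clean_unique(names):
    """Hypothetical helper: clean every name, then fail loudly on collisions."""
    cleaned = [clean_column_name(n) for n in names]
    clashes = sorted({n for n in cleaned if cleaned.count(n) > 1})
    if clashes:
        raise ValueError(f"cleaned names collide: {clashes}")
    return cleaned

print(clean_unique([' Customer ID ', 'last-name', 'PHONE_NUMBER']))
# ['customer_id', 'last_name', 'phone_number']

try:
    clean_unique(['Unit Price', 'unit-price'])
except ValueError as exc:
    print(exc)  # cleaned names collide: ['unit_price']
```

With a DataFrame, the helper would be applied as df.columns = clean_unique(df.columns).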
Real-World Example: Standardizing Columns for ML Pipelines
Machine learning frameworks like scikit-learn are sensitive to column names. Spaces, special characters, and unicode in column names can cause errors in certain estimators and serialization formats.
import pandas as pd
import re
# Simulated raw feature DataFrame
df = pd.DataFrame({
    'Age (years)': [25, 30],
    'Income ($)': [50000, 75000],
    'Has Degree?': [True, False],
    'City/Region': ['NYC', 'LA']
})
# Clean for ML compatibility
df.columns = [
    re.sub(r'[^a-z0-9]+', '_', col.lower()).strip('_')
    for col in df.columns
]
print(df.columns.tolist())
# ['age_years', 'income', 'has_degree', 'city_region']
After renaming, you can explore the cleaned DataFrame visually. PyGWalker turns any pandas DataFrame into a Tableau-style interactive UI inside Jupyter notebooks. This is particularly helpful after column renaming, when you want to verify the structure of your data before feeding it into a model:
import pygwalker as pyg
walker = pyg.walk(df)
No chart code is required -- drag columns onto axes and PyGWalker generates the visualization automatically.
Common Errors and How to Fix Them
KeyError: Column Name Not Found
By default, rename() silently ignores dictionary keys that don't match any existing column, so the KeyError only surfaces later, when you access the column by the new name you expected. The mismatch is often caused by invisible whitespace or case differences.
import pandas as pd
df = pd.DataFrame({' Name ': [1], 'Age': [2]})
# This silently does nothing -- no error, no rename
df = df.rename(columns={'Name': 'name'})
print(df.columns.tolist())
# [' Name ', 'Age'] <-- unchanged because of spaces
# Fix: strip whitespace first
df.columns = df.columns.str.strip()
df = df.rename(columns={'Name': 'name'})
To make rename() raise an error when a key is missing, use the errors parameter:
# Raises KeyError if 'nonexistent' is not a column
df = df.rename(columns={'nonexistent': 'new'}, errors='raise')
ValueError: Length Mismatch
This occurs when assigning df.columns or using set_axis() with a list that has a different length than the current column count.
import pandas as pd
df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})
# Raises: ValueError: Length mismatch: Expected axis has 3 elements
df.columns = ['x', 'y']
The fix is straightforward -- make sure your list has the correct number of elements:
print(f"DataFrame has {len(df.columns)} columns")
df.columns = ['x', 'y', 'z']  # Correct: 3 elements for 3 columns
Duplicate Column Names After Renaming
Renaming can accidentally create duplicate column names. Pandas allows duplicates, but they cause ambiguous indexing.
import pandas as pd
df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})
df = df.rename(columns={'b': 'a'}) # Now two columns named 'a'
print(df['a'])  # Returns both columns as a DataFrame, not a Series
Check for duplicates after renaming:
if df.columns.duplicated().any():
    print("Warning: duplicate column names detected")
    print(df.columns[df.columns.duplicated()].tolist())
Performance: Which Method is Fastest
For most DataFrames (under a few million rows), the performance difference between methods is negligible. The bottleneck is almost always the data operations, not the column renaming. But if you are renaming columns inside a loop or on thousands of DataFrames, here is how the methods compare.
import pandas as pd
import timeit
df = pd.DataFrame({f'col_{i}': range(100) for i in range(100)})
new_names = {f'col_{i}': f'new_{i}' for i in range(100)}
new_list = [f'new_{i}' for i in range(100)]
# Method: rename()
t1 = timeit.timeit(lambda: df.rename(columns=new_names), number=1000)
# Method: df.columns assignment
t2 = timeit.timeit(lambda: setattr(df, 'columns', new_list), number=1000)
# Method: set_axis()
t3 = timeit.timeit(lambda: df.set_axis(new_list, axis=1), number=1000)
print(f"rename(): {t1:.4f}s")
print(f"df.columns: {t2:.4f}s")
print(f"set_axis(): {t3:.4f}s")
Typical results on a 100-column DataFrame:
| Method | Relative Speed | Notes |
|---|---|---|
| df.columns = [...] | Fastest | Direct attribute assignment, no copy |
| set_axis() | ~1.2x slower | Creates a new DataFrame |
| rename(columns={...}) | ~2-3x slower | Dictionary lookup per column |
| rename(columns=func) | ~2-3x slower | Function call per column |
For everyday use, pick the method that makes your code clearest. Optimize only if profiling shows column renaming is an actual bottleneck.
Best Practices
- Use rename() for targeted renames. It is explicit, handles subsets, and returns a new DataFrame by default.
- Use a list comprehension for bulk cleanup. String operations like lowercasing, stripping whitespace, and replacing characters are cleaner in a comprehension than in a loop.
- Standardize early. Run column name cleanup immediately after loading data -- before any filtering, merging, or grouping. If you also need to drop unwanted columns or add new ones, do that in the same cleanup step.
- Avoid inplace=True. It provides no performance advantage and makes debugging harder.
- Check for duplicates. After renaming, verify with df.columns.duplicated().any().
- Use snake_case. The convention lower_case_with_underscores works well with Python attribute access (df.column_name) and avoids quoting issues.
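Several of these practices fit into one short load step. The following is a sketch under stated assumptions: the CSV content is invented, and to_snake_case is a hypothetical helper written for this example rather than a pandas function.

```python
import io
import pandas as pd

raw_csv = " Order ID ,Unit Price,Qty\n1,9.99,3\n2,4.50,10\n"

def to_snake_case(name):
    # Hypothetical helper for this sketch: trim, lowercase, spaces to underscores.
    return name.strip().lower().replace(' ', '_')

orders = (
    pd.read_csv(io.StringIO(raw_csv))
    .rename(columns=to_snake_case)         # standardize right after loading
    .rename(columns={'qty': 'quantity'})   # targeted rename, no inplace=True
)

# Guard against accidental duplicates before any merging or grouping.
assert not orders.columns.duplicated().any()
print(orders.columns.tolist())  # ['order_id', 'unit_price', 'quantity']

# snake_case names also work with attribute access.
print(orders.quantity.sum())    # 13
```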
Frequently Asked Questions
How do I rename a single column in pandas?
Use df.rename(columns={'old_name': 'new_name'}). This returns a new DataFrame with the specified column renamed. All other columns remain unchanged.
How do I rename all columns in a pandas DataFrame?
Assign a list directly: df.columns = ['col_a', 'col_b', 'col_c']. The list must have exactly the same number of elements as the DataFrame has columns. Alternatively, use df.set_axis(['col_a', 'col_b', 'col_c'], axis=1) to return a new DataFrame.
How do I make all column names lowercase in pandas?
Use a list comprehension: df.columns = [col.lower() for col in df.columns]. For additional cleanup like replacing spaces, extend it: df.columns = [col.lower().replace(' ', '_') for col in df.columns].
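An equivalent option is the .str accessor on the column Index, which applies a string method to every label at once. A minimal sketch, assuming all column names are strings:

```python
import pandas as pd

df = pd.DataFrame({'First Name': [1], ' Email Address ': [2]})

# The .str accessor on the column Index applies string methods to every label.
df.columns = (
    df.columns.str.strip()
    .str.lower()
    .str.replace(' ', '_', regex=False)  # literal replacement, not a regex
)
print(df.columns.tolist())  # ['first_name', 'email_address']
```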
Can I rename columns during read_csv?
Yes. Use the names parameter: pd.read_csv('file.csv', header=0, names=['id', 'name', 'value']). Setting header=0 tells pandas to discard the existing header row and use your provided names instead.
What happens if I rename a column to a name that already exists?
Pandas allows duplicate column names, but they cause problems. Indexing with df['duplicate_name'] returns a DataFrame instead of a Series, which breaks most downstream operations. Always check for duplicates after renaming with df.columns.duplicated().any().
Conclusion
The rename() method handles most column renaming tasks -- pass a dictionary for specific columns or a function for bulk transformations. When you need to replace all column names at once, assign a list to df.columns or use set_axis(). For messy real-world headers, combine strip(), lower(), and re.sub() in a list comprehension to standardize everything in one pass.
Pick the method that matches your situation, standardize column names early in your pipeline, and the rest of your analysis code becomes cleaner and less error-prone. Once columns are renamed, you can reorder them to match your preferred layout.
Related Guides
- Drop Columns from a DataFrame -- remove unwanted columns after renaming
- Add a Column to a DataFrame -- add new columns alongside renamed ones
- Reorder DataFrame Columns -- rearrange column order after renaming
- Pandas Merge DataFrames -- merge requires matching column names across DataFrames
- Read CSV Files in Pandas -- set column names at import time with names=