Excel Data Cleanup: Standardising Student Names for VBA Automation

By Andyb 10 min read 1,924 words
vba data cleanup standardize names excel data preprocessing excel automation vba trim function proper case vba
Why your VBA automation breaks with inconsistent names—and exactly how to fix it. Complete guide to cleaning and standardising student data in Excel before automation.

This is Post 2 in our 4-part series on Automating Student Feedback Processing with VBA. In Post 1, we introduced a real-world case study where a teacher needed to process 100 student surveys. Today, we're diving into Step 1: why data cleanup is critical before automation—and exactly how to do it.



Why Your VBA Automation Breaks: The "Sarah Smith" Problem

Imagine you've written VBA code to create separate worksheets for each student. You run the macro, and instead of 100 sheets (one per student), you get 127 sheets. What happened?

The answer lies in your source data. Here's what Excel sees when student names aren't standardised:

How the Name Appears What Excel Thinks Sarah Smith Unique student #1 sarah smith Unique student #2 (different case) Sarah  Smith Unique student #3 (extra space)  Sarah Smith Unique student #4 (leading space) Smith, Sarah Unique student #5 (different order)

To a human, these are all the same person. To Excel, they're five different students—which means your automation creates five separate worksheets instead of one.

"I thought I had 100 students, but my macro created 127 sheets. It took me an hour to realise the problem was inconsistent spacing and capitalisation in my source data."

Real-World Data Chaos: Where Inconsistencies Come From

In the student feedback case study, name inconsistencies arose from multiple sources:

  • Manual entry: Students typing their own names and peers' names (typos, different formats)
  • Copy-paste errors: Accidentally including extra spaces before/after names
  • Export formatting: Data exported from survey tools with inconsistent capitalisation
  • Multiple contributors: Different teaching assistants entering data with different conventions

This isn't unique to education—every dataset collected from multiple sources or manual entry suffers from the same problem. Customer names, product categories, client lists, regional data—all prone to inconsistency.



The Solution: Standardise Student Names Excel VBA

Before running any automation, we need to clean and standardise the data. This means:

  1. Consistent capitalisation: Convert all names to Proper Case (First Letter Uppercase)
  2. Remove whitespace: Trim leading, trailing, and excess internal spaces
  3. Handle edge cases: Deal with hyphenated names, titles (Dr., Mr.), and special characters

The VBA Approach: Three Key Functions

VBA provides three essential text functions for data cleanup:

VBA Function What It Does Example Trim() Removes leading & trailing spaces " Sarah " → "Sarah" Proper() Converts to Proper Case "sarah smith" → "Sarah Smith" Replace() Replaces text patterns "Sarah  Smith" → "Sarah Smith"

Complete VBA Code: Clean Student Names Before Automation

Here's the complete VBA macro to clean and standardise student names in Excel. This code is designed to run before your main automation (sheet creation, chart generation, etc.).

Sub StandardiseStudentNames()
    ' Clean and standardise student names for VBA automation
    ' vbacode.io - Data Cleanup Template

    Dim ws As Worksheet
    Dim lastRow As Long
    Dim cell As Range
    Dim nameColumn As Long
    Dim cleanedName As String

    ' Configuration
    Set ws = ActiveSheet
    nameColumn = 1  ' Column A contains names - adjust if needed

    ' Error handling
    On Error GoTo ErrorHandler

    Application.ScreenUpdating = False

    ' Find last row with data
    lastRow = ws.Cells(ws.Rows.Count, nameColumn).End(xlUp).Row

    ' Validate data exists
    If lastRow < 2 Then
        MsgBox "No data found. Make sure your data starts in row 2.", vbExclamation
        GoTo CleanUp
    End If

    ' Clean each name
    For Each cell In ws.Range(ws.Cells(2, nameColumn), ws.Cells(lastRow, nameColumn))
        If Not IsEmpty(cell.Value) Then
            cleanedName = cell.Value

            ' Step 1: Trim leading/trailing whitespace
            cleanedName = Trim(cleanedName)

            ' Step 2: Convert to Proper Case
            cleanedName = Application.WorksheetFunction.Proper(cleanedName)

            ' Step 3: Remove multiple internal spaces
            Do While InStr(cleanedName, "  ") > 0
                cleanedName = Replace(cleanedName, "  ", " ")
            Loop

            ' Step 4: Handle common edge cases
            cleanedName = FixHyphenatedNames(cleanedName)
            cleanedName = FixApostrophes(cleanedName)

            ' Update cell with cleaned name
            cell.Value = cleanedName
        End If
    Next cell

    ' Success message
    MsgBox "Data cleanup complete!" & vbCrLf & vbCrLf & _
           "Processed " & (lastRow - 1) & " names in column " & Chr(64 + nameColumn) & ".", vbInformation

CleanUp:
    Application.ScreenUpdating = True
    Exit Sub

ErrorHandler:
    Application.ScreenUpdating = True
    MsgBox "Error during cleanup: " & Err.Description, vbCritical
End Sub

' Helper function: Fix hyphenated names (e.g., "Mary-jane" → "Mary-Jane")
Function FixHyphenatedNames(inputName As String) As String
    Dim parts() As String
    Dim i As Integer
    Dim result As String

    If InStr(inputName, "-") > 0 Then
        parts = Split(inputName, "-")
        For i = LBound(parts) To UBound(parts)
            parts(i) = Application.WorksheetFunction.Proper(parts(i))
        Next i
        result = Join(parts, "-")
    Else
        result = inputName
    End If

    FixHyphenatedNames = result
End Function

' Helper function: Fix apostrophes (e.g., "O'brien" → "O'Brien")
Function FixApostrophes(inputName As String) As String
    Dim parts() As String
    Dim i As Integer
    Dim result As String

    If InStr(inputName, "'") > 0 Then
        parts = Split(inputName, "'")
        For i = LBound(parts) To UBound(parts)
            parts(i) = Application.WorksheetFunction.Proper(parts(i))
        Next i
        result = Join(parts, "'")
    Else
        result = inputName
    End If

    FixApostrophes = result
End Function


How to Use This Code: Step-by-Step Setup Instructions

Step 1: Open the VBA Editor

  1. Open your Excel workbook with student data
  2. Press Alt + F11 (Windows) or Fn + Option + F11 (Mac)
  3. This opens the Visual Basic Editor

Step 2: Insert a New Module

  1. In the VBA Editor, click Insert → Module
  2. A blank code window appears
  3. Copy the entire code above and paste it into this window

Step 3: Configure for Your Data

Find this line in the code:

nameColumn = 1  ' Column A contains names - adjust if needed

Change the number if your student names are in a different column:

  • nameColumn = 1 for Column A
  • nameColumn = 2 for Column B
  • nameColumn = 3 for Column C
  • And so on...

Step 4: Run the Macro

  1. Close the VBA Editor (or keep it open)
  2. Back in Excel, press Alt + F8 (Windows) or Fn + Option + F8 (Mac)
  3. Select StandardiseStudentNames from the list
  4. Click Run

The macro runs in seconds and displays a success message showing how many names were processed.



Understanding the Code: What Each Step Does

Step 1: Trim Whitespace

cleanedName = Trim(cleanedName)

This removes invisible spaces before and after the name:

  • "  Sarah Smith " → "Sarah Smith"
  • Fixes copy-paste errors from other applications

Step 2: Convert to Proper Case

cleanedName = Application.WorksheetFunction.Proper(cleanedName)

This capitalises the first letter of each word:

  • "sarah smith" → "Sarah Smith"
  • "SARAH SMITH" → "Sarah Smith"
  • "sArAh SmItH" → "Sarah Smith"

Step 3: Remove Multiple Internal Spaces

Do While InStr(cleanedName, "  ") > 0
    cleanedName = Replace(cleanedName, "  ", " ")
Loop

This collapses multiple spaces between words into single spaces:

  • "Sarah   Smith" → "Sarah Smith"
  • Handles data entry errors where users pressed spacebar multiple times

Step 4: Handle Edge Cases

The helper functions fix two common problems:

Hyphenated names:

  • "mary-jane wilson" → "Mary-Jane Wilson"
  • The Proper() function alone would produce "Mary-jane Wilson" (incorrect)
  • Our function splits on hyphens, applies Proper Case to each part, then rejoins

Apostrophes in names:

  • "o'brien" → "O'Brien"
  • "d'angelo" → "D'Angelo"
  • Same split-and-rejoin approach as hyphenated names


Common Edge Cases: What About...?

Titles (Dr., Mr., Ms.)

Problem: "dr. sarah smith" becomes "Dr. Sarah Smith" but should you remove titles?

Solution: It depends on your use case. If titles matter (faculty list), keep them. If not (student roster), add this code:

cleanedName = Replace(cleanedName, "Dr. ", "")
cleanedName = Replace(cleanedName, "Mr. ", "")
cleanedName = Replace(cleanedName, "Ms. ", "")
cleanedName = Replace(cleanedName, "Mrs. ", "")

Suffixes (Jr., Sr., III)

Problem: "John Smith Jr." vs "John Smith Jr" (with/without period)

Solution: Standardise suffix format:

cleanedName = Replace(cleanedName, " Jr", " Jr.")
cleanedName = Replace(cleanedName, " Sr", " Sr.")
cleanedName = Replace(cleanedName, " Jr..", " Jr.")  ' Fix double periods

Last Name, First Name Format

Problem: Some rows have "Smith, Sarah" and others have "Sarah Smith"

Solution: Decide on one format and convert all names:

' Convert "Last, First" to "First Last"
If InStr(cleanedName, ",") > 0 Then
    Dim nameParts() As String
    nameParts = Split(cleanedName, ",")
    cleanedName = Trim(nameParts(1)) & " " & Trim(nameParts(0))
End If

Non-English Characters (Accents, Symbols)

Problem: Names like "José" or "François" may not standardise correctly

Solution: VBA's Proper() function handles most accented characters correctly. If you encounter issues, manually review these names after running the macro.



Before & After: Real Data Transformation

Here's what the macro did for the student feedback dataset:

Before Cleanup After Cleanup  sarah smith  Sarah Smith MICHAEL  JOHNSON Michael Johnson mary-jane wilson Mary-Jane Wilson O'BRIEN, PATRICK Patrick O'Brien emma rodriguez Emma Rodriguez

Result: 127 "unique" names reduced to 100 actual students—allowing the sheet creation automation to work perfectly.



Why Clean Data Before Automation?

You might be tempted to skip data cleanup and fix automation issues as they arise. Here's why that's a mistake:

1. Garbage In, Garbage Out

If your source data has 127 variations of 100 names, your automation will create 127 worksheets. You'll then spend time:

  • Manually merging duplicate sheets
  • Copying data between sheets
  • Deleting incorrect sheets
  • Wasting more time than if you'd just cleaned the data first

2. Error-Prone Automation

Inconsistent data causes automation to fail unpredictably:

  • Some students get multiple sheets (confusing)
  • Identical names might be treated differently (inconsistent)
  • Your code becomes filled with workarounds and patches (unmaintainable)

3. Future-Proofing

Clean data once, benefit forever:

  • Next quarter's survey data arrives → clean it the same way
  • Annual review data → use the same cleanup macro
  • New task with similar data structure → adapt the existing code


Download the Free Template: Data Processing & Cleanup

Don't want to copy-paste code? We've packaged this exact macro into a free VBA template you can download and use immediately.

📥 Free Template: Data Processing & Cleanup

What's included:

  • Complete VBA code for name standardisation
  • Edge case handling (hyphens, apostrophes, titles)
  • Configurable column selection
  • User-friendly error messages

How to use:

  1. Download the template (no signup required)
  2. Open your Excel file with student/client/category data
  3. Run the macro, select your column, done

Download Free Template →



What's Next: Creating Separate Sheets for Each Student

Now that your data is clean and standardised, you're ready for Step 2: Creating separate worksheets for each student.

In Post 3 (coming Tuesday), we'll cover:

  • How to loop through unique values in a column efficiently
  • Creating and naming worksheets programmatically (with Excel's 31-character limit)
  • Filtering and copying data with robust error handling
  • Free template: Create Sheet Per Unique Value (already live!)

This is the highest-value automation step—turning hours of manual copy-paste work into 30 seconds of automated processing.



Need Custom Code? Generate VBA Tailored to Your Data

The templates in this series work great if your data structure matches closely. But what if you need:

  • Cleanup for multiple columns (first name + last name + email)
  • Additional validation (check for missing data, flag duplicates)
  • Integration with your existing macros
  • Custom edge cases specific to your industry

🤖 Generate Custom VBA Code

Describe your specific requirements in plain English, and our AI generates VBA code tailored exactly to your needs.

Free tier: 3 generations per month to test and iterate
Standard plan (€15/month): 75 generations + 5 code reviews when it doesn't work perfectly
Pro plan (€30/month): 250 generations + unlimited code reviews + o1-mini AI model for complex VBA

Try Free Code Generation →

Coming Tuesday: Post 3 - Creating Separate Excel Sheets for Each Student with VBA

This is the post you've been waiting for—the automation that saves the most time. Don't miss it! Subscribe below to get notified.

🚀 Want More VBA Tips Like This?

Join 1,000+ Excel professionals getting weekly VBA tutorials, free code templates, and automation strategies delivered to their inbox.

Free forever
No spam, ever
Unsubscribe anytime

Or generate VBA code instantly:

Try vbacode.io