
How to Fix "utf8_encode and utf8_decode Deprecated in PHP 8.2"

As a long-time PHP developer and encoding specialist, I was surprised to hear that the popular utf8_encode() and utf8_decode() functions will now trigger deprecation warnings in PHP 8.2.

These functions have been around for ages, so why the sudden axe?

In this comprehensive guide, I'll cover:

  • The history and original intent behind the functions
  • A quick dive into character encoding basics
  • The reasoning and decision to deprecate
  • How to view the deprecation warnings
  • 3 alternative solutions with examples
  • Encoding conversion cheat sheet
  • Common troubleshooting tips

I'll also share my own experience and recommendations as an encoding specialist along the way.

A Look Back at the History of These Functions

First, let's rewind and understand why utf8_encode() and utf8_decode() came about…

Back in the PHP 4 era (the stone ages!), built-in encoding functionality was extremely limited. Developers were dealing with bugs caused by mismatches between ISO-8859-1 (also known as Latin-1) and UTF-8.

The two functions shipped with PHP's XML extension as a quick band-aid for converting between the two most common encodings of the time. Since PHP 7.2 they have been part of the PHP core, so they are available even without that extension.

Over the years they became relied upon by PHP developers to handle general UTF-8 conversion needs despite only supporting ISO-8859-1 and UTF-8.

Character Encoding Crash Course

Before we look at the deprecation decision, let's review some encoding fundamentals…

Character encoding is the process of mapping characters to numeric values that can be stored and transmitted across platforms and languages. This allows text to be represented digitally.

[diagram showing mapping of letter to numbers]

Common encodings such as ASCII, ISO-8859-1, UTF-8 and UTF-16 each define their own mapping between characters and bytes. Converting between encodings means re-expressing the same characters as the byte sequences of the target encoding.

Mixing encodings without proper conversion causes missing, corrupted, or unintended characters in text – essentially encoding errors.

The two functions address exactly one pair: utf8_encode() converts ISO-8859-1 to UTF-8, and utf8_decode() converts UTF-8 back to ISO-8859-1.
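To make the byte-level difference concrete, here is a minimal sketch that prints the same character in both encodings. It assumes the mbstring extension is loaded, and the input is written as a byte escape so the result does not depend on how the source file itself is saved.

```php
<?php
// "é" as a single ISO-8859-1 byte (0xE9).
$latin1 = "\xE9";

// Re-express the same character as UTF-8.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), PHP_EOL; // e9
echo bin2hex($utf8), PHP_EOL;   // c3a9
echo strlen($latin1), ' byte vs ', strlen($utf8), ' bytes', PHP_EOL; // 1 byte vs 2 bytes
```

Same character, different bytes: that is all an encoding conversion changes.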

The Deprecation Decision – Why Now?

The PHP core contributors decided deprecation was needed due to a few key factors:

Limited Scope

The functions only handle ISO-8859-1 and UTF-8, not the many other encodings in use today. Even the closely related Windows-1252 is not covered.

Silent Failures

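They fail silently if the input string does not match the expected encoding: utf8_encode() will happily double-encode a string that is already UTF-8, and utf8_decode() replaces any character it cannot represent in ISO-8859-1 with a question mark.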
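Both failure modes are easy to reproduce. The sketch below uses mb_convert_encoding() with the same encoding pair, so it runs without deprecation notices; as far as I know, the deprecated functions produce the same bytes for these inputs. It assumes mbstring is loaded with its default substitute character.

```php
<?php
$utf8 = "caf\xC3\xA9"; // "café", already valid UTF-8

// Declaring UTF-8 input as ISO-8859-1 re-encodes each byte on its own,
// which is the classic double-encoding bug.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($double), PHP_EOL; // 636166c383c2a9

// The euro sign (U+20AC) has no ISO-8859-1 equivalent, so it is
// replaced by the substitute character and the original is lost.
$euro  = "\xE2\x82\xAC";
$lossy = mb_convert_encoding($euro, 'ISO-8859-1', 'UTF-8');
var_dump($lossy); // string(1) "?" with the default substitute character
```

Neither call raises an error, which is why this kind of corruption tends to surface only once the text reaches a browser or a database.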

Inappropriate Names

Their names falsely imply ability to handle general UTF-8 conversion.

After 15+ years, better alternatives now exist, which justified deprecating these outdated functions in PHP 8.2. They are slated for removal in PHP 9.0.

Developers attempting to use them will now see deprecation warnings.

PHP Throws Warnings Now

With the knowledge of why they are going away, let's look at what appears if you call them:

$latin1 = "Cr\xE8me br\xFBl\xE9e"; // "Crème brûlée" as ISO-8859-1 bytes
$utf8 = utf8_encode($latin1);

// Deprecated: Function utf8_encode() is deprecated in ... on line ...

And decoding:

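$text = utf8_decode($utf8);

// Deprecated: Function utf8_decode() is deprecated in ... on line ...

The notices are hard to miss, provided error_reporting includes E_DEPRECATED (E_ALL does) and errors are displayed or logged.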
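If you would rather collect the notices than read them off the screen, a small error handler does the job. This is a sketch, not a full audit tool: the $deprecations array is my own illustrative name, and the call is guarded with function_exists() so the script still runs on versions where the function has been removed.

```php
<?php
error_reporting(E_ALL);

$deprecations = [];

// Collect E_DEPRECATED notices instead of printing them.
set_error_handler(
    function (int $errno, string $errstr, string $errfile, int $errline) use (&$deprecations): bool {
        $deprecations[] = sprintf('%s (%s:%d)', $errstr, $errfile, $errline);
        return true; // mark as handled so PHP's default output is skipped
    },
    E_DEPRECATED
);

if (function_exists('utf8_encode')) {
    utf8_encode("abc"); // raises a deprecation notice on PHP 8.2 and later
}

restore_error_handler();

foreach ($deprecations as $message) {
    echo $message, PHP_EOL;
}
```

On PHP versions before 8.2 the list simply stays empty.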

Solution 1: mb_convert_encoding()

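The recommended alternative is the mb_convert_encoding() function from the mbstring extension. It performs the same conversion, but takes the target and source encodings as explicit arguments and supports many more encodings:

// "Crème brûlée" written as ISO-8859-1 bytes
$utf8 = mb_convert_encoding("Cr\xE8me br\xFBl\xE9e", 'UTF-8', 'ISO-8859-1');

$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

Naming both the source and target encoding makes the conversion explicit. The characters survive as long as the declared source encoding matches the actual bytes.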

Pros:

  • Fully featured
  • Bundled with PHP (mbstring extension)
  • Actively maintained
  • Handles many character encodings (mb_list_encodings() lists them)

Cons:

  • Slightly more verbose
  • Requires the mbstring extension, which is not enabled in every PHP build

Solution 2: Intl Extension

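The Intl extension provides UConverter::transcode() with similar encoding conversion capabilities. Note that the target encoding comes before the source encoding:

// "Crème brûlée" written as ISO-8859-1 bytes
$utf8 = UConverter::transcode("Cr\xE8me br\xFBl\xE9e", 'UTF-8', 'ISO-8859-1');

$latin1 = UConverter::transcode($utf8, 'ISO-8859-1', 'UTF-8');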

Pros:

  • Clean interface
  • Part of Internationalization extension suite

Cons:

  • Requires the intl extension to be installed
  • Argument order (string, target, source) differs from iconv()

Solution 3: iconv()

The iconv() function has been around since PHP 4.0.5 with wide support for encodings. Here the source encoding comes first:

// "Crème brûlée" written as ISO-8859-1 bytes
$utf8 = iconv('ISO-8859-1', 'UTF-8', "Cr\xE8me br\xFBl\xE9e");

$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

Pros:

  • Lightweight
  • Long history

Cons:

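  • Source encoding comes first, the opposite of mb_convert_encoding()
  • Limited error handling: unconvertible input yields false plus an E_NOTICE, and the //TRANSLIT and //IGNORE suffixes behave differently across iconv implementations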
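Because of that last point, I always check iconv()'s return value. The sketch below only shows the pattern. What you actually get back depends on the iconv implementation PHP was built against (glibc, libiconv, musl), so treat any particular result as an assumption about your platform.

```php
<?php
$euro = "\xE2\x82\xAC"; // "€" in UTF-8, not representable in ISO-8859-1

// @ suppresses the E_NOTICE so the return value can be inspected instead.
$result = @iconv('UTF-8', 'ISO-8859-1', $euro);

if ($result === false) {
    echo 'Conversion failed: character has no ISO-8859-1 equivalent', PHP_EOL;
} else {
    // Some implementations substitute a placeholder instead of failing.
    echo 'Converted bytes: ', bin2hex($result), PHP_EOL;
}

// //TRANSLIT asks iconv to approximate unmappable characters.
// The approximation, if any, is implementation-dependent.
$translit = @iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $euro);
var_dump($translit);
```

If your code must behave the same on every server, mb_convert_encoding() is the more predictable choice.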

Encoding Conversion Cheat Sheet

Here is a quick reference for converting ISO-8859-1/Latin-1 and UTF-8 with all three alternatives:

| Function | Latin-1 to UTF-8 | UTF-8 to Latin-1 |
| --- | --- | --- |
| mb_convert_encoding | mb_convert_encoding($str, 'UTF-8', 'ISO-8859-1'); | mb_convert_encoding($str, 'ISO-8859-1', 'UTF-8'); |
| UConverter::transcode | UConverter::transcode($str, 'UTF-8', 'ISO-8859-1'); | UConverter::transcode($str, 'ISO-8859-1', 'UTF-8'); |
| iconv | iconv('ISO-8859-1', 'UTF-8', $str); | iconv('UTF-8', 'ISO-8859-1', $str); |
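If you have many call sites to migrate, one option is a small wrapper that picks whichever conversion extension is available. This is a sketch: latin1_to_utf8() is a hypothetical helper name of my own, not a PHP function, and the order of preference is simply my habit.

```php
<?php
/**
 * Hypothetical replacement for utf8_encode(): converts ISO-8859-1 bytes to UTF-8
 * using whichever conversion extension is available.
 */
function latin1_to_utf8(string $latin1): string
{
    if (function_exists('mb_convert_encoding')) {
        $out = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
        if (is_string($out)) {
            return $out;
        }
    }

    if (class_exists('UConverter')) {
        // Note the order: string, target encoding, source encoding.
        $out = UConverter::transcode($latin1, 'UTF-8', 'ISO-8859-1');
        if ($out !== false) {
            return $out;
        }
    }

    if (function_exists('iconv')) {
        // Note the order: source encoding, target encoding, string.
        $out = iconv('ISO-8859-1', 'UTF-8', $latin1);
        if ($out !== false) {
            return $out;
        }
    }

    throw new RuntimeException('No encoding conversion extension is available.');
}

echo bin2hex(latin1_to_utf8("Cr\xE8me")), PHP_EOL; // 4372c3a86d65
```

A matching helper for the opposite direction would swap the encoding names in each call.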

Troubleshooting Tips

When converting encodings, here are some common issues and solutions:

Problem: Warning message about invalid characters

Solution: The input string does not match the declared source encoding. Validate it with mb_check_encoding() before converting. mb_detect_encoding() can only guess, and short strings fool it easily.
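One way to guard a conversion is to validate first and convert only when needed. The sketch assumes mbstring is loaded and, more importantly, that anything which is not valid UTF-8 is ISO-8859-1. That assumption fits many legacy datasets, but it is not a general rule.

```php
<?php
// ISO-8859-1 bytes: 0xE8 followed by "m" is not a valid UTF-8 sequence.
$input = "Cr\xE8me";

if (mb_check_encoding($input, 'UTF-8')) {
    // Already valid UTF-8: converting again would double-encode it.
    $utf8 = $input;
} else {
    // Assumption: non-UTF-8 input in this dataset is ISO-8859-1.
    $utf8 = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}

echo bin2hex($utf8), PHP_EOL; // 4372c3a86d65
```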

Problem: Output contains ? or � replacement characters

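Solution: This usually means the target encoding has no mapping for a character. ISO-8859-1, for example, cannot represent € or emoji. Keep the text in UTF-8, or choose a target encoding that covers the characters you need.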

Problem: No warnings but scrambled text

Solution: One of the declared encodings is incorrect. Log the raw bytes with bin2hex() before and after the conversion and compare them with what each encoding should produce.
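When text looks scrambled, I find it quicker to look at the bytes than at the rendered characters. A minimal sketch, where hex_dump_bytes() is a hypothetical helper of my own and mbstring is assumed to be loaded:

```php
<?php
// Hypothetical helper: render a string as space-separated hex bytes.
function hex_dump_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

$before = "br\xFBl\xE9e"; // "brûlée" as ISO-8859-1 bytes
$after  = mb_convert_encoding($before, 'UTF-8', 'ISO-8859-1');

echo 'before: ', hex_dump_bytes($before), PHP_EOL; // before: 62 72 fb 6c e9 65
echo 'after:  ', hex_dump_bytes($after), PHP_EOL;  // after:  62 72 c3 bb 6c c3 a9 65
```

A single byte such as fb or e9 points to Latin-1. A pair such as c3 bb points to UTF-8. A run like c3 83 c2 bb is the signature of double encoding.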

Closing Thoughts

I'm hopeful this article has prepped you to upgrade your own code! To recap:

  • utf8_encode() and utf8_decode() served their purpose long ago but were limited in scope
  • The core PHP team deprecated them to promote more robust alternatives
  • Always double check your input encodings when doing conversions
  • Leverage mb_convert_encoding() for future-proof needs!

Contact me with any other encoding hot takes or questions. Happy coding!