How to Detect Character Encoding in PHP

09/06/2021

In this article, you will learn how to detect character encoding in PHP.

PHP mb_detect_encoding() Function

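In PHP, you can use the mb_detect_encoding() function to detect the character encoding of a string. It takes the string as its first argument and returns the name of the detected encoding, or false if the string is not valid in any of the candidate encodings. The optional second argument specifies a list of encodings to try, in order of priority; if it is omitted, the current detection order (set with mb_detect_order() or the mbstring.detect_order setting) is used. An optional third argument enables strict mode, in which the function returns false instead of the closest match when no candidate fits.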

Example:

<?php
  $string = "some text";

  // Returns the encoding name, or false if detection fails.
  $encoding = mb_detect_encoding($string);

  echo "Encoding: " . ($encoding === false ? "unknown" : $encoding);
?>

Note that this function is part of the “mbstring” extension, which is not enabled by default, so you may need to install and enable it before using the function.

The mb_detect_encoding() function uses a heuristic approach to detect the character encoding of a string. It examines the input string and tries to match its character patterns to a known character encoding.

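It is important to note that the function is not always accurate, especially when the input string is short, contains bytes that are invalid in the expected encoding, or mixes text from multiple encodings. Many byte sequences are also valid in several encodings at once: pure ASCII text, for instance, is valid in UTF-8, ISO-8859-1, and Windows-1251 alike. In such cases, you may need to manually specify the list of encodings to try, enable strict mode, or use a different method to determine the encoding.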
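As a minimal sketch of handling a failed detection, the snippet below passes true as the third argument of mb_detect_encoding() (strict mode), so that a string which is invalid in every candidate yields false instead of a best guess. The helper name detect_encoding_strict() is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: returns the detected encoding name, or null when
// the string is not valid in any of the candidate encodings.
function detect_encoding_strict(string $text, array $candidates): ?string
{
    // Third argument = strict mode: an invalid string yields false
    // rather than the closest-matching encoding.
    $detected = mb_detect_encoding($text, $candidates, true);

    return $detected === false ? null : $detected;
}

$valid   = "caf\u{00E9}"; // "café" encoded as UTF-8
$invalid = "caf\xE9";     // "café" as ISO-8859-1 bytes, which is not valid UTF-8

var_dump(detect_encoding_strict($valid, ['UTF-8']));   // string(5) "UTF-8"
var_dump(detect_encoding_strict($invalid, ['UTF-8'])); // NULL
```

Returning null for the failure case lets the caller decide what to do next, for example rejecting the input or trying a wider list of candidates.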

The mbstring extension also provides a set of functions for handling multibyte characters, including converting between encodings, searching and replacing within strings, and more. These functions can be useful for working with text in different character encodings.
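For illustration, here is a short sketch of converting between encodings with mb_convert_encoding() and then measuring the result with mb_strlen(). It assumes the source encoding is already known; the helper name to_utf8() is hypothetical.

```php
<?php
// Hypothetical helper: convert text from a known source encoding to UTF-8.
function to_utf8(string $text, string $sourceEncoding): string
{
    // Text that is already valid UTF-8 is returned unchanged.
    if ($sourceEncoding === 'UTF-8' && mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    return mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
}

$latin1 = "caf\xE9";                  // "café" in ISO-8859-1 (4 bytes)
$utf8   = to_utf8($latin1, 'ISO-8859-1');

echo mb_strlen($utf8, 'UTF-8'), "\n"; // 4 (characters)
echo strlen($utf8), "\n";             // 5 (bytes, since "é" takes two bytes in UTF-8)
```

The difference between the two lengths shows why the multibyte-aware mb_* functions are preferable to the byte-oriented string functions when working with non-ASCII text.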

Here is an example of how you can specify a list of encodings to try:

<?php
  $string = "some text";
  $encodings_to_try = array('UTF-8', 'ISO-8859-1', 'Windows-1251');
  $encoding = mb_detect_encoding($string, $encodings_to_try);

  echo "Encoding: $encoding";
?>

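In this example, the candidates are UTF-8, ISO-8859-1, and Windows-1251, in that order of priority. A string that is valid in only one of them is reported as that encoding. When a string is valid in several candidates, older PHP versions return the first match in list order, while recent versions weigh the candidates heuristically, so the first listed encoding is not always the one returned. Keep in mind that single-byte encodings such as ISO-8859-1 accept almost any byte sequence, so they should be listed after stricter encodings such as UTF-8.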
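If you need a strictly predictable "first valid candidate wins" rule, one alternative is to check each encoding yourself with mb_check_encoding(). The sketch below is not how mb_detect_encoding() works internally; the function name first_valid_encoding() is hypothetical.

```php
<?php
// Hypothetical helper: returns the first candidate in which the string
// is valid, or null if none of the candidates accept it.
function first_valid_encoding(string $text, array $candidates): ?string
{
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($text, $encoding)) {
            return $encoding;
        }
    }

    return null;
}

$candidates = ['UTF-8', 'ISO-8859-1', 'Windows-1251'];

echo first_valid_encoding("caf\u{00E9}", $candidates), "\n"; // UTF-8
echo first_valid_encoding("caf\xE9", $candidates), "\n";     // ISO-8859-1
```

Because every byte value is a valid ISO-8859-1 character, the Windows-1251 entry is never reached in this sketch. Validity checks alone cannot tell single-byte encodings apart, so that decision has to come from other information, such as a declared charset.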