
A Guide to Modern Python String Formatting Tools – Real Python


When working with strings in Python, you may need to interpolate values into your string and format these values to create new strings dynamically. In modern Python, you have f-strings and the .format() method to approach the tasks of interpolating and formatting strings.

To get the most out of this tutorial, you should know the basics of Python programming and the string data type.



Getting to Know String Interpolation and Formatting in Python

Python has developed different string interpolation and formatting tools over the years. If you’re getting started with Python and looking for a quick way to format your strings, then you should use Python’s f-strings.

If you need to work with older versions of Python or legacy code, then it’s a good idea to learn about the other formatting tools, such as the .format() method.

In this tutorial, you’ll learn how to format your strings using f-strings and the .format() method. You’ll start with f-strings to kick things off, which are quite popular in modern Python code.

Using F-Strings for String Interpolation

Python has a string formatting tool called f-strings, which stands for formatted string literals. F-strings are string literals that you can create by prepending an f or F to the literal. They allow you to do string interpolation and formatting by inserting variables or expressions directly into the literal.

Creating F-String Literals

Here you’ll take a look at how you can create an f-string by prepending the string literal with an f or F:
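A minimal sketch of both prefixes, using an illustrative greeting:

```python
# Both prefixes create the same kind of string literal.
lower = f"Hello, Pythonista!"
upper = F"Hello, Pythonista!"

print(lower)           # Hello, Pythonista!
print(lower == upper)  # True
```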

Using either f or F has the same effect. However, it’s a more common practice to use a lowercase f to create f-strings.

Just like with regular string literals, you can use single, double, or triple quotes to define an f-string:
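For instance, with an illustrative greeting, the four quoting styles produce equal strings:

```python
single = f'Hello, Pythonista!'
double = f"Hello, Pythonista!"
triple_single = f'''Hello, Pythonista!'''
triple_double = f"""Hello, Pythonista!"""

print(single == double == triple_single == triple_double)  # True
```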

Up to this point, your f-strings look pretty much the same as regular strings. However, if you create f-strings like those in the examples above, which contain no replacement fields, then your code linter may complain if you use one.

The remarkable feature of f-strings is that you can embed Python variables or expressions directly inside them. To insert the variable or expression, you must use a replacement field, which you create using a pair of curly braces.

Interpolating Variables Into F-Strings

The variable that you insert in a replacement field is evaluated and converted to its string representation. The result is interpolated into the original string at the replacement field’s location:
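A minimal sketch, assuming a site variable that holds an illustrative name:

```python
site = "Real Python"

# The replacement field {site} is swapped for the variable's value.
greeting = f"Welcome to {site}!"
print(greeting)  # Welcome to Real Python!
```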

In this example, you’ve interpolated the site variable into your string. Note that Python treats anything outside the curly braces as a regular string.

Also, keep in mind that Python retrieves the value of site when it evaluates the string literal, so if site isn’t defined at that time, then you get a NameError exception. Therefore, f-strings are appropriate for eager string interpolation.

Now that you’ve learned how to embed a variable into an f-string, you can look into embedding Python expressions into your f-string literals.

Embedding Expressions in F-Strings

You can embed almost any Python expression in an f-string, including arithmetic, Boolean, and conditional expressions. You can also include function calls, attribute access, common sequence operations like indexing and slicing, and more.

Here’s an example that uses an arithmetic operation:
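A short sketch with made-up quantities (integers are used to keep the output simple):

```python
quantity = 6
price = 2

# The multiplication is evaluated before the result is interpolated.
total = f"{quantity} bananas cost ${price * quantity}"
print(total)  # 6 bananas cost $12
```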

The expressions that you embed in an f-string can be almost arbitrarily complex. The examples below show some of the possibilities.
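As a sketch of a few such possibilities, with illustrative names and values, here are a method call, a function call, and a conditional expression:

```python
name = "jane"
age = 25

method_call = f"Name: {name.title()}"
function_call = f"Name length: {len(name)}"
conditional = f"Status: {'adult' if age >= 18 else 'minor'}"

print(method_call)    # Name: Jane
print(function_call)  # Name length: 4
print(conditional)    # Status: adult
```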

You can also do indexing and slicing on sequences and look up keys in dictionaries:
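A minimal sketch, assuming an illustrative list of fruits and a small dictionary:

```python
fruits = ["apple", "mango", "grape"]
numbers = {"one": 1, "two": 2}

first = f"First fruit in the list is '{fruits[0]}'"
last_two = f"Last two fruits in the list are {fruits[-2:]}"
lookup = f"Dict value for key 'one' is {numbers['one']}"

print(first)     # First fruit in the list is 'apple'
print(last_two)  # Last two fruits in the list are ['mango', 'grape']
print(lookup)    # Dict value for key 'one' is 1
```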

In this example, the first two embedded expressions run indexing and slicing operations in a list. The last expression runs a dictionary key lookup.

To include literal curly braces in an f-string, you need to escape them by doubling them:
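A minimal sketch, with an illustrative lang variable:

```python
lang = "Python"

# {{ and }} each produce one literal brace; {lang} is a replacement field.
braced = f"{{ {lang} }}"
print(braced)  # { Python }
```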

In this example, each doubled brace ({{ or }}) produces a single literal brace character in the resulting string, while a single pair of braces still marks a replacement field.

Using the .format() Method for String Interpolation

In many ways, the Python string .format() method is similar to the older string modulo operator, but .format() goes well beyond it in terms of versatility. The general form of a .format() call is shown below:
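The general form can be sketched as template_string.format(*args, **kwargs). Here is a small illustration with a made-up template and values:

```python
# General form: template_string.format(*args, **kwargs)
template_string = "{} is {age} years old"

# "Jane" is a positional argument and age=25 is a keyword argument.
result = template_string.format("Jane", age=25)
print(result)  # Jane is 25 years old
```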

You typically call the method on a string template, which is a string containing replacement fields. The *args and **kwargs arguments allow you to specify the values to insert into the template. The resulting string is returned from the method.

In the template string, replacement fields are enclosed in curly braces ({}). Anything outside of the curly braces is literal text that’s copied directly from the template to the output.

Positional Arguments With Manual Field Specification

To interpolate values into a string template using .format(), you can use positional arguments in the method call. You can then use integer indices to determine which replacement field to insert each value into:
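A minimal sketch using the template and arguments described in the text:

```python
template_string = "{0} {1} cost ${2}"

# Each index selects the positional argument at that zero-based position.
result = template_string.format(6, "bananas", 1.74 * 6)
print(result)  # 6 bananas cost $10.44
```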

In this example, template_string is the string "{0} {1} cost ${2}", which includes three replacement fields. The replacement fields {0}, {1}, and {2} contain numbers that correspond to the zero-based positional arguments 6, "bananas", and 1.74 * 6. Each positional argument is inserted into the template according to its index.

The following figure summarizes the complete process:

[Figure: Using the String .format() Method With Positional Arguments and Indices]

The arguments to .format() are inserted into the string template in the corresponding position. The first argument goes into the replacement field with index 0, the second argument goes into the replacement field with index 1, and so on.

It’s important to note that the indices don’t have to follow a strict consecutive order or be unique in the template. This allows you to customize the position of each argument in the final string.

Here’s a toy example:
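In this sketch, the placeholder strings are arbitrary. The indices appear in reverse order first and are then repeated:

```python
result = "{2}.{1}.{0}/{0}{0}.{1}{1}.{2}{2}".format("foo", "bar", "baz")
print(result)  # baz.bar.foo/foofoo.barbar.bazbaz
```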

When you specify a replacement field number that’s out of range, you’ll get an error. In the following example, the positional arguments are numbered 0, 1, and 2, but you specify {3} in the template:
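A minimal sketch with arbitrary placeholder strings. The try block is only there so that the snippet runs to completion:

```python
error_name = None
try:
    # Only indices 0, 1, and 2 exist, so {3} can't be resolved.
    "{3}".format("foo", "bar", "baz")
except IndexError as error:
    error_name = type(error).__name__
    print(f"{error_name}: {error}")
```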

This call to .format() raises an IndexError exception because index 3 is out of range.

Positional Arguments With Automatic Field Numbering

You can also omit the indices in the replacement fields, in which case Python will assume a sequential order. This is referred to as automatic field numbering:
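A minimal sketch that reuses the bananas template without indices:

```python
# Empty replacement fields are filled in the order of the arguments.
result = "{} {} cost ${}".format(6, "bananas", 1.74 * 6)
print(result)  # 6 bananas cost $10.44
```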

In this example, you’ve removed the indices from your template. In this situation, Python inserts every argument into the replacement field following the same order you used in the call to .format().

When you specify automatic field numbering, you must provide at least as many arguments as there are replacement fields.

Here’s a toy example with four replacement fields and only three arguments:
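The placeholder strings below are arbitrary, and the try block is only there so that the snippet runs to completion:

```python
error_name = None
try:
    # Four replacement fields, but only three arguments.
    "{}{}{}{}".format("foo", "bar", "baz")
except IndexError as error:
    error_name = type(error).__name__
    print(f"{error_name}: {error}")
```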

In this example, you have four replacement fields in the template but only three arguments in the call to .format(). So, you get an IndexError exception.

Finally, it’s fine if the arguments outnumber the replacement fields. The excess arguments aren’t used:
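A minimal sketch with two replacement fields and three arguments:

```python
# The third argument has no replacement field, so it's left out.
result = "{}{}".format("foo", "bar", "baz")
print(result)  # foobar
```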

Here, Python ignores the "baz" argument and builds the final string using only "foo" and "bar".

Note that you can’t mix these two techniques:
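As a sketch, the template below combines explicit indices with an empty field, which Python rejects. The try block is only there so that the snippet runs to completion:

```python
error_name = None
try:
    # {1} and {0} are explicit, while {} relies on automatic numbering.
    "{1} {} {0}".format("foo", "bar", "baz")
except ValueError as error:
    error_name = type(error).__name__
    print(f"{error_name}: {error}")
```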

When you use Python to format strings with positional arguments, you must choose between either automatic or explicit replacement field numbering.

Keyword Arguments and Named Fields

You can also use keyword arguments instead of positional arguments to produce the same result:
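A minimal sketch using the field names mentioned in the text:

```python
template_string = "{quantity} {item} cost ${cost}"

# Each field is filled by the keyword argument with the same name.
result = template_string.format(quantity=6, item="bananas", cost=1.74 * 6)
print(result)  # 6 bananas cost $10.44
```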

In this case, the replacement fields are {quantity}, {item}, and {cost}. These fields specify keywords corresponding to the keyword arguments quantity=6, item="bananas", and cost=1.74 * 6. Each keyword value is inserted into the template in place of its corresponding replacement field by name.

Keyword arguments are inserted into the template string in place of keyword replacement fields with the same name:
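For instance, with arbitrary placeholder values:

```python
result = "{x} {y} {z}".format(x="foo", y="bar", z="baz")
print(result)  # foo bar baz
```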

In this example, the values of the keyword arguments x, y, and z take the place of the replacement fields {x}, {y}, and {z}, respectively.

If you refer to a keyword argument that’s missing, then you’ll get an error:
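A minimal sketch with arbitrary placeholder values. The try block is only there so that the snippet runs to completion:

```python
error_name = None
try:
    # The template asks for w, but only x, y, and z are provided.
    "{x} {y} {w}".format(x="foo", y="bar", z="baz")
except KeyError as error:
    error_name = type(error).__name__
    print(f"{error_name}: {error}")  # KeyError: 'w'
```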

In this example, you specify the {w} replacement field, but no corresponding keyword argument is named w in the call to .format(), so Python raises a KeyError exception.

You can specify keyword arguments in any arbitrary order:
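Two short sketches with arbitrary placeholder values:

```python
# Fields in alphabetical order, arguments shuffled.
first = "{x} {y} {z}".format(y="bar", z="baz", x="foo")

# Fields shuffled, arguments in alphabetical order.
second = "{y} {z} {x}".format(x="foo", y="bar", z="baz")

print(first)   # foo bar baz
print(second)  # bar baz foo
```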

In the first example, the replacement fields are in alphabetical order and the arguments aren’t. In the second example, it’s the other way around.

You can specify positional and keyword arguments in one .format() call. In this case, all of the positional arguments must appear before any of the keyword arguments:
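A minimal sketch of a valid call, with arbitrary placeholder values:

```python
# Positional arguments come first, followed by the keyword argument.
result = "{0} {x} {1}".format("foo", "bar", x="baz")
print(result)  # foo baz bar
```

Placing x="baz" before the positional arguments would be a SyntaxError, so that variant isn't shown as runnable code.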

The requirement that all positional arguments appear before any keyword arguments doesn’t only apply to the .format() method. This is generally true for any function or method call in Python.

In all the examples so far, the values you passed to .format() have been literal values, but you can specify variables as well:
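A minimal sketch, in which the keyword name s is an arbitrary choice:

```python
x = "foo"
y = "bar"
z = "baz"

# x and y are passed positionally, and z is passed as the keyword s.
result = "{0} {1} {s}".format(x, y, s=z)
print(result)  # foo bar baz
```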

In this example, you pass the variables x and y as positional arguments and z as a keyword argument.

Doing Formatted Interpolation: Components of a Replacement Field

Now that you know the basics of how to interpolate values into your strings using f-strings or .format(), you’re ready to learn about formatting.

When you call Python’s .format() method, the template string contains replacement fields. A replacement field consists of three components.

Here’s the BNF notation for the replacement field syntax:
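The grammar in the comment below is reproduced from the Python documentation for str.format(). The demonstration that follows is an illustrative sketch:

```python
# Grammar of a replacement field in a .format() template:
#   replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"

# field_name is 0, conversion is r, and format_spec is >10.
result = "{0!r:>10}".format("Hi")
print(repr(result))  # "      'Hi'"
```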

The three components are interpreted as shown in the table below:

| Component | Description |
| --- | --- |
| field_name | Specifies the source of the value to be formatted |
| conversion | Indicates which standard Python function to use to perform the type conversion |
| format_spec | Specifies the format specifier to use when formatting the input value |

Each component is optional and may be omitted. The field_name component can be a name or an index as you’ve already learned.

F-strings also have replacement fields. Their syntax is similar:
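The grammar in the comment below follows the Python language reference. The demonstration is an illustrative sketch, and the = specifier requires Python 3.8 or later:

```python
# Grammar of a replacement field in an f-string:
#   replacement_field ::= "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"

value = 42

self_documenting = f"{value = }"  # Expression text plus its value
converted = f"{value!r:>6}"       # Conversion r and format_spec >6

print(self_documenting)  # value = 42
print(repr(converted))   # '    42'
```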

As shown here, f-strings have up to four components. The interpretation is mostly the same as with the .format() method. However, in an f-string, the f_expression component can hold a variable or expression. The equal sign (=) is optional and allows you to create self-documenting strings.

Up to this point, you’ve coded examples that show how to use the f_expression component in f-strings and the field_name component in .format(). In the following sections, you’ll learn about the other two components, which work similarly in f-strings and .format().

The conversion Component

The conversion component defines the function to use when converting the input value into a string. Python can do this conversion using built-in functions like the following:

  • str() provides a user-friendly string representation.
  • repr() provides a developer-friendly string representation.

By default, both f-strings and the .format() method use str(). However, in some situations, you may want to use repr(). You can do this with the conversion component of a replacement field.

The possible values for conversion are shown in the table below:

| Value | Function |
| --- | --- |
| !s | str() – the default |
| !r | repr() |
| !a | ascii() |

To illustrate the difference between these two values, consider the following Person class:
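A minimal sketch of such a class. The name and age attributes, and the exact wording of both representations, are assumptions made for illustration:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        # User-friendly representation, used by str()
        return f"I'm {self.name}, and I'm {self.age} years old."

    def __repr__(self):
        # Developer-friendly representation, used by repr()
        return f"{type(self).__name__}(name='{self.name}', age={self.age})"
```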

This class implements the special methods .__str__() and .__repr__(), which internally support the built-in str() and repr() functions.

Now consider how this class works in the context of f-strings and the .format() method:
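A minimal sketch of both tools in action. A compact Person class is defined here so that the snippet runs on its own, and its attributes are illustrative assumptions:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return f"I'm {self.name}, and I'm {self.age} years old."

    def __repr__(self):
        return f"Person(name='{self.name}', age={self.age})"


jane = Person("Jane", 25)

friendly = f"{jane!s}"
developer = f"{jane!r}"
friendly_format = "{!s}".format(jane)
developer_format = "{!r}".format(jane)

print(friendly)   # I'm Jane, and I'm 25 years old.
print(developer)  # Person(name='Jane', age=25)
```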

When you use the !s value for the conversion component, you get the user-friendly string representation of the interpolated object. Similarly, when you use the !r value, you get the developer-friendly string representation.

The format_spec Component

The format_spec component is the last portion of a replacement field. This component represents the guts of Python’s string formatting functionality. It contains information that exerts fine control over how to format the input values before inserting them into the template string.

The BNF notation that describes the syntax of this component is shown below:
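The grammar in the comment below follows the Python documentation as of version 3.11, which introduced the z option. The demonstration is an illustrative sketch that combines several components:

```python
# Grammar of a format specifier:
#   format_spec ::= [[fill]align][sign]["z"]["#"]["0"][width]
#                   [grouping_option]["." precision][type]

# fill=*, align=>, sign=+, width=10, grouping_option=",", precision=2, type=f
result = format(1234.5, "*>+10,.2f")
print(result)  # *+1,234.50
```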

The components of format_spec are listed in order in the following table:

| Component | Description | Possible Values |
| --- | --- | --- |
| fill | Specifies the character to use for padding values that don’t occupy the entire field width | Any character |
| align | Specifies how to justify values that don’t occupy the entire field width | <, >, =, or ^ |
| sign | Controls whether a leading sign is included for numeric values | +, -, or a space |
| z | Coerces negative zero floating-point values to positive zero after rounding to the format precision (Python 3.11 and later) | z |
| # | Selects an alternate output form for certain presentation types, such as integers | # |
| 0 | Causes values to be padded on the left with zeros instead of ASCII space characters | 0 |
| width | Specifies the minimum width of the output | Integer value |
| grouping_option | Specifies a grouping character for numeric output | _ or , |
| precision | Specifies the number of digits after the decimal point for floating-point presentation types, and the maximum output width for string presentation types | Integer value |
| type | Specifies the presentation type, which is the type of conversion performed on the corresponding argument | b, c, d, e, E, f, F, g, G, n, o, s, x, X, or % |

In the following section, you’ll learn how these components work in practice and how you can use them to format your strings either with f-string literals or with the .format() method.

Format Specifiers and Their Components

In practice, when you’re creating format specifiers to format the values that you interpolate into your strings, you can use different components according to your specific needs. In the following sections, you’ll learn about the format specifier components and how to use them.

The type Component

To kick things off, you’ll start with the type component, which is the final portion of a format_spec. The type component specifies the presentation type, which is the type of conversion that’s performed on the corresponding value to produce the output.

The possible values for type are described below:

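| Value | Presentation Type |
| --- | --- |
| b | Binary integer |
| c | Unicode character |
| d | Decimal integer |
| e or E | Exponential |
| f or F | Floating-point number |
| g or G | Floating-point or exponential number |
| n | Locale-aware number (like d for integers and g for floats, with the current locale’s separators) |
| o | Octal integer |
| s | String |
| x or X | Hexadecimal integer |
| % | Percentage value |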

The first presentation type you have is b, which designates binary integer conversion:
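A minimal sketch that formats 257 with both tools:

```python
with_fstring = f"{257:b}"
with_format = "{:b}".format(257)

print(with_fstring)  # 100000001
print(with_format)   # 100000001
```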

In these examples, you use the b presentation type to represent the decimal number 257 as a binary number.

The c presentation type allows you to convert an input integer into its associated Unicode character:
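A minimal sketch with two arbitrary code points:

```python
asterisk = f"{42:c}"
letter = "{:c}".format(65)

print(asterisk)  # *
print(letter)    # A

# chr() returns the same characters.
print(asterisk == chr(42), letter == chr(65))  # True True
```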

As shown above, you can convert a given integer value into its associated Unicode character with the c presentation type. Note that you can use the built-in chr() function to confirm the conversion.

The g presentation type chooses either floating-point or exponential output, depending on the magnitude of the exponent and the precision:
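A minimal sketch with two arbitrary values. With the default precision of 6 significant digits, the small number stays in fixed-point form and the large one switches to exponential form:

```python
small = f"{3.14159:g}"
large = f"{-123456789.8765:g}"

print(small)  # 3.14159
print(large)  # -1.23457e+08
```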

The exact rules governing the choice might seem slightly complicated. Generally, you can trust that the choice will make sense.

The G presentation type is identical to g except for when the output is exponential, in which case the "E" will be displayed in uppercase:
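A minimal sketch with an arbitrary large value:

```python
large = f"{-123456789.8765:G}"
print(large)  # -1.23457E+08
```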

The result is the same as in the previous example, but this time with an uppercase "E".

You’ll find a couple of other situations where you’ll see a difference between the g and G presentation types. For example, under some circumstances, a floating-point operation can result in a value that’s essentially infinite. The string representation of such a number in Python is "inf".

A floating-point operation may also produce a value that can’t be represented as a number. Python represents this value with the string "nan", which stands for Not a Number.

When you pass these values to an f-string or the .format() method, the g presentation type produces lowercase output, and G produces uppercase output:
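A minimal sketch that builds both special values with float(). The variable names are illustrative:

```python
infinite = float("inf")
not_a_number = float("nan")

lower = f"{infinite:g} {not_a_number:g}"
upper = f"{infinite:G} {not_a_number:G}"

print(lower)  # inf nan
print(upper)  # INF NAN
```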

You’ll see similar behavior with the f and F presentation types. For more information on floating-point representation, inf, and NaN, check out the Wikipedia page on IEEE 754.

To learn more about other presentation types, take a look at the Converting Between Type Representations section of Python’s Format Mini-Language for Tidy Strings.

The width Component

The width component specifies the minimum width of the output field:
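A minimal sketch with a width of 8 and two sample values:

```python
text = f"{'Hi':8s}"
number = f"{123:8d}"

# repr() makes the padding visible.
print(repr(text))    # 'Hi      '
print(repr(number))  # '     123'
```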

Note that this is a minimum field width. Suppose you specify a value that’s longer than the minimum:
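For instance, with an illustrative ten-character string and a width of 2:

```python
result = f"{'Pythonista':2s}"
print(result)  # Pythonista
```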

In this example, width is effectively ignored and the final string displays the input value as is.

The align and fill Components

The align and fill components allow you to control how the formatted output is padded and positioned within the specified field width. These components only make a difference when the input value doesn’t occupy the entire field width, which can only happen if a minimum field width is specified. If width isn’t specified, then align and fill are effectively ignored.

Here are the possible values for the align subcomponent:

| Value | Description |
| --- | --- |
| < | Aligns the value to the left |
| > | Aligns the value to the right |
| ^ | Centers the value |
| = | Aligns the sign of numeric values |

A format specifier that uses the less than sign (<) indicates that the output will be left-justified:
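A minimal sketch with a width of 8:

```python
text = f"{'Hi':<8s}"
number = f"{123:<8d}"

print(repr(text))    # 'Hi      '
print(repr(number))  # '123     '
```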

Aligning the value to the left is the default behavior with strings like "Hi".

A format specifier that uses the greater than sign (>) indicates that the output will be right-justified:
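A minimal sketch with a width of 8:

```python
text = f"{'Hi':>8s}"
number = f"{123:>8d}"

print(repr(text))    # '      Hi'
print(repr(number))  # '     123'
```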

Aligning to the right is the default behavior for numeric values like 123.

A format specifier that uses a caret (^) indicates that the output will be centered in the output field:
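A minimal sketch with a width of 8. The sample values are chosen so that the padding splits evenly:

```python
text = f"{'Hi':^8s}"
number = f"{1234:^8d}"

print(repr(text))    # '   Hi   '
print(repr(number))  # '  1234  '
```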

With the caret character, you can center the input value in the output field.

Finally, you can specify a value for the align component using the equal sign (=). This sign only has meaning for numeric values with signs included. When numeric output includes a sign, it’s normally placed directly to the left of the first digit in the number:
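A minimal sketch of the default placement, using the + sign option and a width of 8:

```python
positive = f"{123:+8d}"
negative = f"{-123:+8d}"

print(repr(positive))  # '    +123'
print(repr(negative))  # '    -123'
```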

In these examples, you’ve used the sign component, which you’ll learn about in detail in the next section.

If you set align to the equal sign (=), then the sign appears at the left of the output field:
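A minimal sketch with the same values and width, now with = alignment:

```python
positive = f"{123:=+8d}"
negative = f"{-123:=+8d}"

print(repr(positive))  # '+    123'
print(repr(negative))  # '-    123'
```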

As you can see, the sign now appears on the left and the padding is added in between the sign and the number.

The fill component allows you to replace the extra space when the input value doesn’t completely fill the output width. It can be any character except for curly braces ({}).

Some examples of using fill are shown below:
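The fill characters in this sketch are arbitrary choices:

```python
right = f"{'Hi':->8s}"
left = f"{123:#<8d}"
center = f"{'Hi':*^8s}"

print(right)   # ------Hi
print(left)    # 123#####
print(center)  # ***Hi***
```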

Keep in mind that if you specify a value for fill, then you must also include a value for align. Otherwise, Python can’t tell the fill character apart from the rest of the format specifier.

The sign Component

You can control whether a sign appears in numeric output with the sign component of your format specifiers. In the following example, the plus sign (+) indicates that the value should always display a leading sign:
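A minimal sketch with a positive and a negative value:

```python
positive = f"{123:+d}"
negative = f"{-123:+d}"

print(positive)  # +123
print(negative)  # -123
```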

In these examples, you use the plus sign to always include a leading sign for both positive and negative values.

If you use the minus sign (-), then only negative numeric values will include a leading sign:
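A minimal sketch with the same two values:

```python
positive = f"{123:-d}"
negative = f"{-123:-d}"

print(positive)  # 123
print(negative)  # -123
```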

The - sign lets you display the sign when the input value is negative. If the input value is positive, then no sign is displayed. This is also the default behavior when you omit the sign component.

Finally, you can also use a space (" ") for the sign component. A space means that a sign is included for negative values and a space character for positive values. To make this behavior evident in the example below, you use an asterisk as the fill character:
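A minimal sketch with a width of 6, right alignment, and an asterisk as the fill character:

```python
positive = f"{123:*> 6d}"
negative = f"{-123:*> 6d}"

print(repr(positive))  # '** 123'
print(repr(negative))  # '**-123'
```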

As you can see in these examples, positive values include a space rather than a plus sign. On the other hand, negative values include the actual minus sign.

The # Component

When you include a hash character (#) in the format_spec component, Python will select an alternate output form for certain presentation types. For binary (b), octal (o), and hexadecimal (x) presentation types, the hash character causes the inclusion of an explicit base indicator to the left of the value:
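A minimal sketch that formats the arbitrary value 16 with and without the hash character:

```python
plain = f"{16:b} {16:o} {16:x}"
prefixed = f"{16:#b} {16:#o} {16:#x}"

print(plain)     # 10000 20 10
print(prefixed)  # 0b10000 0o20 0x10
```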

The base indicators are 0b, 0o, and 0x for binary, octal, and hexadecimal representations, respectively.

For floating-point (f or F) and exponential (e or E) presentation types, the hash character forces the output to contain a decimal point, even if the input consists of a whole number:
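A minimal sketch with a precision of 0, which normally drops the decimal point:

```python
fixed = f"{123:.0f}"
fixed_hash = f"{123:#.0f}"
exponential = f"{123:.0e}"
exponential_hash = f"{123:#.0e}"

print(fixed, fixed_hash)              # 123 123.
print(exponential, exponential_hash)  # 1e+02 1.e+02
```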

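The hash character (#) also affects the g and G presentation types, where it keeps trailing zeros that would otherwise be removed. For the decimal integer type (d), it has no effect, and for string values, it raises a ValueError.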

The 0 Component

If the output is smaller than the indicated field width and you start the format_spec component with a zero (0), then the input value will be padded on the left with zeros instead of space characters:
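A minimal sketch with a few arbitrary numbers. Note that the zeros go between the sign and the digits:

```python
integer = f"{123:05d}"
negative = f"{-123:06d}"
decimal = f"{12.3:08.1f}"

print(integer)   # 00123
print(negative)  # -00123
print(decimal)   # 000012.3
```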

You’ll typically use the 0 component for numeric values, as shown above. However, it works for string values as well:
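A minimal sketch with a string value. The explicit > alignment is included so that the zeros land on the left. Without it, the behavior for strings depends on the Python version:

```python
result = f"{'Hi':>06s}"
print(result)  # 0000Hi
```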

If you specify both fill and align, then fill overrides the 0 component:
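For instance, with an asterisk as the fill character:

```python
# The explicit fill character (*) wins over the 0 flag.
result = f"{123:*>05d}"
print(result)  # **123
```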

The fill and 0 components essentially control the same thing, so there isn’t any need to specify both at the same time. In practice, 0 is superfluous and was probably included as a convenience for developers who are familiar with the string modulo operator’s similar 0 conversion flag.

The grouping_option Component

The grouping_option component allows you to include a grouping separator character in numeric outputs. For decimal and floating-point presentation types, grouping_option may be either a comma (,) or an underscore (_). That character then separates each group of three digits in the output:
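A minimal sketch with arbitrary integer and floating-point values:

```python
int_comma = f"{1234567:,d}"
int_underscore = f"{1234567:_d}"
float_comma = f"{1234567.89:,.2f}"
float_underscore = f"{1234567.89:_.2f}"

print(int_comma)         # 1,234,567
print(int_underscore)    # 1_234_567
print(float_comma)       # 1,234,567.89
print(float_underscore)  # 1_234_567.89
```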

In these examples, you’ve used a comma and an underscore as thousand separators for integer and floating-point values.

Setting the grouping_option component to an underscore (_) may also be useful with the binary, octal, and hexadecimal presentation types. In those cases, each group of four digits is separated by an underscore character in the output:
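A minimal sketch with arbitrary binary and hexadecimal literals:

```python
binary = f"{0b111010100001:_b}"
hexadecimal = f"{0xae123fcc8ab2:_x}"

print(binary)       # 1110_1010_0001
print(hexadecimal)  # ae12_3fcc_8ab2
```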

If you try to specify grouping_option with any presentation type other than those listed above, then your code will raise an exception.

The precision Component

The precision component specifies the number of digits after the decimal point for floating-point presentation types:
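A minimal sketch with a width of 8 and different precision values:

```python
two_digits = f"{1234.5678:8.2f}"
four_digits = f"{1.23:8.4f}"
exponential = f"{1234.5678:8.2e}"

print(repr(two_digits))   # ' 1234.57'
print(repr(four_digits))  # '  1.2300'
print(repr(exponential))  # '1.23e+03'
```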

In these examples, you use different precision values to display the output number. The precision is separated from the width by a literal dot (.).

For string presentation types, precision specifies the maximum width of the output:
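For instance, with an illustrative ten-character string and a precision of 6:

```python
result = f"{'Pythonista':.6s}"
print(result)  # Python
```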

If the input value is longer than the specified precision value, then the output will be truncated.

Creating Format Specifiers Dynamically

Inside a format specifier, you can nest pairs of curly braces ({}) that allow you to provide values using variables. That portion of the replacement field will be evaluated at runtime and replaced with the corresponding value:
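A minimal sketch of both versions, with illustrative width and precision values:

```python
width = 10
precision = 2

with_fstring = f"{123.4567:{width}.{precision}f}"

# {0} and {1} inside the specifier refer to width and precision.
with_format = "{2:{0}.{1}f}".format(width, precision, 123.4567)

print(repr(with_fstring))  # '    123.46'
print(repr(with_format))   # '    123.46'
```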

Here, you’ve nested two pairs of curly braces in the string. The f-string version inserts the width and precision values directly in the nested braces. In the .format() version, the nested braces contain the 0 and 1 indices, which map to the first two arguments of the method.

You can also use keyword arguments in the call to .format(). The following example is functionally equivalent to the previous one:
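A minimal sketch in which the keyword names number, width, and precision are illustrative choices:

```python
result = "{number:{width}.{precision}f}".format(
    width=10, precision=2, number=123.4567
)
print(repr(result))  # '    123.46'
```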

As you can tell, this update of the .format() example is way more readable and descriptive than the one before.

Conclusion

Now, you know how to do string interpolation and formatting in Python. You’ve learned how to use modern Python tools like f-strings and the .format() method to effectively handle string interpolation and formatting.

In this tutorial, you’ve:

  • Used f-strings and the .format() method for string interpolation
  • Formatted the input values using different components of a replacement field
  • Created custom format specifiers to format your strings

With this knowledge, you’re ready to start interpolating and formatting values into your strings using f-strings and the .format() method with the appropriate format specifiers.
