Creating Custom Tag in Python PyYAML
Last Updated : 23 Jul, 2024
YAML (a recursive acronym for "YAML Ain't Markup Language") is a human-readable data serialization format that is often used alongside JSON and XML for configuration and data interchange. PyYAML is a YAML parser and emitter library for Python, so it can both read and write YAML documents. Another useful feature of PyYAML is its support for custom tags, which let you define new tags and work with more complex data in the YAML format.
What Are Custom Tags?
In YAML, a tag indicates the data type of a node. For instance, the built-in tag !!str marks a string and !!int marks an integer. PyYAML also lets you define custom tags that represent more complex or domain-specific data types. This is especially useful when configuration files or data formats need more than the simple built-in types.
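As a small illustration of how tags drive typing, the sketch below uses the standard !!str and !!int tags to override PyYAML's default type detection. The key names version and port are made up for this example.

```python
import yaml

# Without tags, PyYAML would infer a float for 1.0 and a string for '8080'.
# Explicit built-in tags override that inference.
document = """
version: !!str 1.0
port: !!int '8080'
"""

config = yaml.safe_load(document)
print(config)
```

Custom tags work the same way, except that you supply the code that turns the tagged node into a Python object.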
Why Use Custom Tags?
Custom tags are beneficial when you need to:
- Encode complex data structures.
- Represent domain-specific concepts.
- Keep the data representation clean and uncluttered.
- Keep values consistent and valid.
Creating Custom Tag in Python PyYAML
To create and use custom tags in PyYAML, you need to follow these steps:
- Define the custom data structure.
- Create a Python class to represent the custom data structure.
- Implement the necessary logic to serialize and deserialize the custom data structure.
- Register the custom tag with PyYAML.
Step 1: Define the Custom Data Structure
Let's define a simple custom data structure for a point in a 2D space:
!point
x: 10
y: 20
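Before a tag like this is registered, PyYAML has no way to build an object for it. The following sketch, which assumes the local tag !point and uses safe_load purely for illustration, shows the error you would typically see at this stage.

```python
import yaml
from yaml.constructor import ConstructorError

document = """
!point
x: 10
y: 20
"""

try:
    yaml.safe_load(document)
    error = None
except ConstructorError as exc:
    # No constructor is registered for the '!point' tag yet.
    error = exc

print(type(error).__name__)
```

The remaining steps register a constructor and a representer so that this tag can be loaded and dumped.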
Step 2: Create a Python Class
Next, create a Python class to represent this custom data structure. Save this class in a file named point.py.
Python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"
Step 3: Implement Serialization and Deserialization
Create a file named custom_tags.py to implement the logic for serializing and deserializing the custom data structure and to register the custom tag with PyYAML.
Python
import yaml

from point import Point


def point_representer(dumper, data):
    # Serialize a Point as a mapping tagged with !point
    return dumper.represent_mapping('!point', {'x': data.x, 'y': data.y})


def point_constructor(loader, node):
    # Build a Point from the mapping found under a !point tag
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])


# Register the representer and constructor with PyYAML
yaml.add_representer(Point, point_representer)
yaml.add_constructor('!point', point_constructor)
Step 4: Using the Custom Tag
Now, you can use the custom tag in your YAML files and load them with PyYAML:
File: example.yaml
!point
x: 10
y: 20
Load the YAML data in the main.py file:
Python
import yaml

# Importing custom_tags also registers the !point tag with PyYAML
from custom_tags import Point

# Load the YAML data
with open('example.yaml', 'r') as file:
    point = yaml.load(file, Loader=yaml.FullLoader)

print(point)

# Dump the Point object back to YAML
yaml_string = yaml.dump(point)
print(yaml_string)
Step 5: Run the Python Script
Navigate to the my_project directory in your terminal and run the main.py script:
cd path/to/my_project
python main.py
Output:
Point(x=10, y=20)
!point
x: 10
y: 20
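The registration above is global. As an alternative sketch, the constructor and representer can be attached to dedicated subclasses of SafeLoader and SafeDumper, so the custom tag only applies where you ask for it. The class names PointLoader and PointDumper and the sample document are assumptions made for this example; the snippet is a single file so it can be run on its own.

```python
import yaml


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"


# Hypothetical subclasses: registration stays local to these classes
# instead of affecting every use of SafeLoader/SafeDumper.
class PointLoader(yaml.SafeLoader):
    pass


class PointDumper(yaml.SafeDumper):
    pass


def point_constructor(loader, node):
    values = loader.construct_mapping(node)
    return Point(values["x"], values["y"])


def point_representer(dumper, data):
    return dumper.represent_mapping("!point", {"x": data.x, "y": data.y})


PointLoader.add_constructor("!point", point_constructor)
PointDumper.add_representer(Point, point_representer)

# The tag also works on nested nodes, here inside a list.
document = """
shape: triangle
corners:
  - !point {x: 0, y: 0}
  - !point {x: 4, y: 0}
  - !point {x: 0, y: 3}
"""

data = yaml.load(document, Loader=PointLoader)
print(data["corners"])

# Dump and load again to check that the tag survives a round trip.
dumped = yaml.dump(data, Dumper=PointDumper)
round_trip = yaml.load(dumped, Loader=PointLoader)
print(round_trip["corners"])
```

Because the loader is based on SafeLoader, only the standard YAML tags and !point are accepted, which is usually a reasonable default for configuration files from untrusted sources.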
Advanced PyYAML Custom Tags
Let's consider a more advanced example where we define a custom tag for a 3D point:
File: point3d.py
Python
class Point3D:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point3D(x={self.x}, y={self.y}, z={self.z})"
File: custom_tags.py
Add the following code to handle the 3D point:
Python
import yaml

from point import Point
from point3d import Point3D


# Existing Point serialization and deserialization
def point_representer(dumper, data):
    return dumper.represent_mapping('!point', {'x': data.x, 'y': data.y})


def point_constructor(loader, node):
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])


# New Point3D serialization and deserialization
def point3d_representer(dumper, data):
    return dumper.represent_mapping('!point3d', {'x': data.x, 'y': data.y, 'z': data.z})


def point3d_constructor(loader, node):
    values = loader.construct_mapping(node)
    return Point3D(values['x'], values['y'], values['z'])


# Register the representers and constructors with PyYAML
yaml.add_representer(Point, point_representer)
yaml.add_constructor('!point', point_constructor)
yaml.add_representer(Point3D, point3d_representer)
yaml.add_constructor('!point3d', point3d_constructor)
File: example3d.yaml
!point3d
x: 10
y: 20
z: 30
File: main3d.py
Python
import yaml

# Importing custom_tags also registers the !point3d tag with PyYAML
from custom_tags import Point3D

# Load the YAML data
with open('example3d.yaml', 'r') as file:
    point3d = yaml.load(file, Loader=yaml.FullLoader)

print(point3d)  # Output: Point3D(x=10, y=20, z=30)

# Dump the Point3D object back to YAML
yaml_string = yaml.dump(point3d)
print(yaml_string)
Run the Python Script
Navigate to the my_project directory in your terminal and run the main3d.py script:
cd path/to/my_project
python main3d.py
Output:
Point3D(x=10, y=20, z=30)
!point3d
x: 10
y: 20
z: 30
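For simple classes like this one, PyYAML also offers yaml.YAMLObject, a base class that registers the constructor and representer for you based on a yaml_tag attribute. The sketch below is an alternative to the hand-written functions above, not a replacement for them; it assumes a PyYAML version that provides FullLoader (5.1 or later).

```python
import yaml


class Point3D(yaml.YAMLObject):
    # PyYAML reads yaml_tag and registers this class automatically.
    yaml_tag = "!point3d"

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point3D(x={self.x}, y={self.y}, z={self.z})"


document = """
!point3d
x: 10
y: 20
z: 30
"""

# Loading fills the instance attributes from the mapping keys;
# __init__ is not called in this path.
point3d = yaml.load(document, Loader=yaml.FullLoader)
print(point3d)

yaml_string = yaml.dump(point3d)
print(yaml_string)
```

The trade-off is less control: the mapping keys must match the attribute names exactly, whereas explicit constructor functions can validate or transform values.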
Advanced Features
Here are some of the key advanced features:
1. Custom Constructors and Representers
Custom constructors and representers allow you to define how YAML nodes are converted to Python objects and vice versa. This feature is particularly useful for handling complex data structures or domain-specific objects.
Example: Custom Constructor and Representer for a Date
Python
import yaml
from datetime import datetime


class CustomDate(datetime):
    pass


def date_constructor(loader, node):
    value = loader.construct_scalar(node)
    return CustomDate.strptime(value, '%Y-%m-%d')


def date_representer(dumper, data):
    value = data.strftime('%Y-%m-%d')
    return dumper.represent_scalar('!date', value)


yaml.add_constructor('!date', date_constructor)
yaml.add_representer(CustomDate, date_representer)
Usage:
Python
# YAML data with custom date tag
yaml_data = """
!date '2024-07-10'
"""

# Load the YAML data
date_obj = yaml.load(yaml_data, Loader=yaml.FullLoader)
print(date_obj)

# Dump the date object back to YAML
yaml_string = yaml.dump(date_obj)
print(yaml_string)
Output:
2024-07-10 00:00:00
2. Custom Resolver
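An implicit resolver tells PyYAML to assign a tag to plain, untagged scalars that match a regular expression, so values can be written without an explicit tag. This can be used to create more intuitive or concise YAML representations.
Example: Custom Resolver for Dates
Python
import re
import yaml

# add_implicit_resolver expects a compiled regular expression and,
# optionally, the characters a matching scalar can start with.
date_pattern = re.compile(r'^\d{4}-\d{2}-\d{2}$')
yaml.add_implicit_resolver('!date', date_pattern, first=list('0123456789'))

# Note: ISO dates such as 2024-07-10 also match PyYAML's built-in timestamp
# resolver, which is checked first, so such values load as datetime.date.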
Usage:
Python
# YAML data with implicit date recognition
yaml_data = """
2024-07-10
"""

# Load the YAML data
date_obj = yaml.load(yaml_data, Loader=yaml.FullLoader)
print(date_obj)
Output:
2024-07-10
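To see an implicit resolver actually claim a value, it helps to pick a pattern that none of PyYAML's built-in resolvers match. The sketch below uses day/month/year dates written with slashes; the tag name !slashdate, the loader subclass SlashDateLoader, and the key released are all hypothetical names chosen for this example.

```python
import re
from datetime import datetime

import yaml


# Hypothetical loader subclass so the global loaders stay untouched.
class SlashDateLoader(yaml.SafeLoader):
    pass


def slash_date_constructor(loader, node):
    # Convert the matched scalar, e.g. "10/07/2024", into a date object.
    value = loader.construct_scalar(node)
    return datetime.strptime(value, "%d/%m/%Y").date()


# DD/MM/YYYY is not matched by the built-in int, float, or timestamp
# resolvers, so this pattern is the one that tags such scalars.
slash_date_pattern = re.compile(r"^\d{2}/\d{2}/\d{4}$")

SlashDateLoader.add_implicit_resolver("!slashdate", slash_date_pattern, list("0123456789"))
SlashDateLoader.add_constructor("!slashdate", slash_date_constructor)

release = yaml.load("released: 10/07/2024", Loader=SlashDateLoader)
print(release)
```

The resolver only decides which tag a scalar gets; the constructor registered for that tag is still what builds the Python object.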
3. Multi-Document YAML
PyYAML supports multi-document YAML streams, so you can load several documents from a single file and dump several documents to one.
Example: Multi-Document YAML
Python
import yaml

# Multi-document YAML data
yaml_data = """
---
name: Document 1
value: 123
---
name: Document 2
value: 456
"""

# Load multiple documents
documents = list(yaml.load_all(yaml_data, Loader=yaml.FullLoader))
print(documents)

# Dump multiple documents
yaml_string = yaml.dump_all(documents)
print(yaml_string)
Output:
[{'name': 'Document 1', 'value': 123}, {'name': 'Document 2', 'value': 456}]
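When writing a multi-document stream, it can be useful to mark every document explicitly and to read the stream back lazily. The following sketch assumes the same two sample documents and uses the safe variants of the dump and load functions.

```python
import yaml

documents = [
    {"name": "Document 1", "value": 123},
    {"name": "Document 2", "value": 456},
]

# explicit_start=True writes a '---' marker before every document,
# including the first one.
yaml_string = yaml.safe_dump_all(documents, explicit_start=True)
print(yaml_string)

# safe_load_all returns a generator, so documents are parsed one at a time.
names = [doc["name"] for doc in yaml.safe_load_all(yaml_string)]
print(names)
```

Iterating over the generator instead of wrapping it in list() keeps memory use low for long streams.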
Conclusion
Custom tags in PyYAML let you extend YAML with your own data types and domain-specific structures. You define the types as Python classes and supply constructor and representer functions that tell PyYAML how to deserialize and serialize them. This makes PyYAML a flexible option for managing and exchanging configuration data in Python-based software systems.