Dataclasses Avro Schema
If you are immersed in the data streaming world, you have probably faced the serialization problem. There are different techniques and frameworks to solve it, for example Thrift, Protocol Buffers or Apache Avro.
Personally, I use Avro serialization, and I always had to come up with Avro schemas based on the desired payload, keeping in mind field specifications and attributes. This is not a heavy task for simple use cases, but with complex types, data relationships (nested schemas) or custom types the process gets a bit complicated. I asked myself: what if we could generate the Avro schemas from a Python class? Most of the time, the payload that we want to get after deserialization is based on a Python class anyway. The end result was:
Dataclasses Avro Schema: generate Avro schemas from a Python class 😀
Let's see an example. Suppose that we want an Avro schema that represents a User:
    {
      "type": "record",
      "name": "User",
      "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "has_pets", "type": "boolean"},
        {"name": "money", "type": "float"}
      ],
      "doc": "User(name: str, age: int, has_pets: bool, money: float)"
    }
Instead of remembering all the field specifications, we can write a Python class and get the schema from it:
    from dataclasses_avroschema.schema_generator import SchemaGenerator


    class User:
        name: str
        age: int
        has_pets: bool
        money: float


    SchemaGenerator(User).avro_schema()

The result is the same schema:

    {
      "type": "record",
      "name": "User",
      "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "has_pets", "type": "boolean"},
        {"name": "money", "type": "float"}
      ],
      "doc": "User(name: str, age: int, has_pets: bool, money: float)"
    }
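To get a feel for how a schema like this can be derived from type hints, here is a minimal sketch that uses only the standard library. It is not the library's actual implementation: `avro_schema_sketch` and `PRIMITIVE_TYPES` are names made up for illustration, and the type mapping simply mirrors the User example.

```python
import dataclasses
import json
import typing

# Assumed mapping from Python types to Avro primitives, mirroring the User example.
PRIMITIVE_TYPES = {
    str: "string",
    int: "int",
    bool: "boolean",
    float: "float",
    type(None): "null",
}


def avro_schema_sketch(cls) -> str:
    """Hypothetical helper: build an Avro record schema (as JSON) from type hints."""
    # Turning the class into a dataclass gives us ordered fields and a
    # generated docstring that lists the field names and types.
    data_cls = dataclasses.dataclass(cls)
    hints = typing.get_type_hints(data_cls)
    fields = [
        {"name": field.name, "type": PRIMITIVE_TYPES[hints[field.name]]}
        for field in dataclasses.fields(data_cls)
    ]
    schema = {
        "type": "record",
        "name": data_cls.__name__,
        "fields": fields,
        "doc": data_cls.__doc__,
    }
    return json.dumps(schema)


class User:
    name: str
    age: int
    has_pets: bool
    money: float


user_schema = json.loads(avro_schema_sketch(User))
print(json.dumps(user_schema, indent=2))
```

The sketch only handles primitive types; anything else raises a `KeyError`, which is exactly the part where a real generator has to do more work.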
Super simple and straightforward. It supports all of these features:
- Primitive types: int, long, float, boolean, string and null support
- Complex types: enum, array, map, fixed, unions and records support
- Logical Types: date, time, datetime, uuid support
- Schema relations (oneToOne, oneToMany)
- Recursive Schemas
- Generate Avro Schemas from faust.Record
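To illustrate what supporting complex types and schema relations involves, here is a rough standard-library sketch of how Python type hints could map to Avro arrays, maps, unions and nested records. The helpers `to_avro_type` and `record_schema` are hypothetical names, not the dataclasses-avroschema API, and the sketch does not cover enums, fixed, logical types or recursive schemas.

```python
import dataclasses
import typing

# Assumed primitive mapping, for illustration only.
PRIMITIVES = {str: "string", int: "int", float: "float", bool: "boolean", type(None): "null"}


def to_avro_type(py_type):
    """Hypothetical helper: translate one Python type hint into an Avro type."""
    if py_type in PRIMITIVES:
        return PRIMITIVES[py_type]
    if dataclasses.is_dataclass(py_type):
        # A nested dataclass becomes a nested record (a oneToOne relation).
        return record_schema(py_type)
    origin = typing.get_origin(py_type)
    args = typing.get_args(py_type)
    if origin is list:
        # List[X] becomes an Avro array; List[SomeRecord] is a oneToMany relation.
        return {"type": "array", "items": to_avro_type(args[0])}
    if origin is dict:
        # Avro map keys are always strings, so only the value type matters.
        return {"type": "map", "values": to_avro_type(args[1])}
    if origin is typing.Union:
        # Optional[X] is Union[X, None], so it becomes a two-member union.
        return [to_avro_type(arg) for arg in args]
    raise TypeError(f"Unsupported type in this sketch: {py_type!r}")


def record_schema(cls) -> dict:
    """Hypothetical helper: build an Avro record schema for a dataclass."""
    hints = typing.get_type_hints(cls)
    return {
        "type": "record",
        "name": cls.__name__,
        "fields": [
            {"name": field.name, "type": to_avro_type(hints[field.name])}
            for field in dataclasses.fields(cls)
        ],
    }


@dataclasses.dataclass
class Address:
    street: str
    number: int


@dataclasses.dataclass
class Customer:
    name: str
    address: Address                # oneToOne
    emails: typing.List[str]        # array
    scores: typing.Dict[str, int]   # map
    nickname: typing.Optional[str]  # union


customer_schema = record_schema(Customer)
```

A recursive class (one that references itself) would make this sketch loop forever, which is one reason a dedicated library is nicer than rolling your own.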
So, if you need an Avro schema, give dataclasses-avroschema a chance 😉