
Author: Fatal1ty. License: Apache License 2.0.

mashumaro (マシュマロ)

mashumaro is a fast and well tested serialization framework on top of dataclasses.


When using dataclasses, you often need to dump and load objects according to the described schema. This framework not only adds the ability to serialize to different formats, but also makes serialization fast.


Installation

Use pip to install:

shell
$ pip install mashumaro

Supported serialization formats

This framework adds methods for dumping to and loading from the following formats:

  • plain dict
  • JSON
  • YAML
  • MessagePack

A plain dict can be useful when you need to pass a dict object to a third-party library, such as a client for MongoDB.

Supported field types

There is support for generic types from the standard `typing` module:

  • List
  • Tuple
  • NamedTuple
  • Set
  • FrozenSet
  • Deque
  • Dict
  • OrderedDict
  • TypedDict
  • Mapping
  • MutableMapping
  • Counter
  • ChainMap
  • Sequence

for standard generic types on PEP 585 compatible Python (3.9+):

  • list
  • tuple
  • namedtuple
  • set
  • frozenset
  • collections.abc.Set
  • collections.abc.MutableSet
  • collections.deque
  • dict
  • collections.OrderedDict
  • collections.abc.Mapping
  • collections.abc.MutableMapping
  • collections.Counter
  • collections.ChainMap
  • collections.abc.Sequence
  • collections.abc.MutableSequence

for special primitives from the `typing` module:

  • Any
  • Optional
  • Union
  • TypeVar

for standard interpreter types from the `types` module:

  • NoneType
  • UnionType

for enumerations based on classes from the standard `enum` module:

  • Enum
  • IntEnum
  • Flag
  • IntFlag

for common built-in types:

  • int
  • float
  • bool
  • str
  • bytes
  • bytearray

for built-in datetime oriented types:

  • datetime
  • date
  • time
  • timedelta
  • timezone

for pathlike types:

  • PurePath
  • Path
  • PurePosixPath
  • PosixPath
  • PureWindowsPath
  • WindowsPath
  • os.PathLike

for other less popular built-in types:

  • uuid.UUID
  • decimal.Decimal
  • fractions.Fraction
  • ipaddress.IPv4Address
  • ipaddress.IPv6Address
  • ipaddress.IPv4Network
  • ipaddress.IPv6Network
  • ipaddress.IPv4Interface
  • ipaddress.IPv6Interface

for arbitrary types:

  • user-defined classes
  • user-defined generic types

Usage example

from enum import Enum
from typing import List
from dataclasses import dataclass
from mashumaro import DataClassJSONMixin

class Currency(Enum):
    USD = "USD"
    EUR = "EUR"

@dataclass
class CurrencyPosition(DataClassJSONMixin):
    currency: Currency
    balance: float

@dataclass
class StockPosition(DataClassJSONMixin):
    ticker: str
    name: str
    balance: int

@dataclass
class Portfolio(DataClassJSONMixin):
    currencies: List[CurrencyPosition]
    stocks: List[StockPosition]

my_portfolio = Portfolio(
    currencies=[
        CurrencyPosition(Currency.USD, 238.67),
        CurrencyPosition(Currency.EUR, 361.84),
    ],
    stocks=[
        StockPosition("AAPL", "Apple", 10),
        StockPosition("AMZN", "Amazon", 10),
    ]
)

json_string = my_portfolio.to_json()
Portfolio.from_json(json_string)  # same as my_portfolio

How does it work?

This framework works by taking the schema of the data and generating a specific parser and builder for exactly that schema. This is much faster than inspection of field types on every call of parsing or building at runtime.
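
The idea can be illustrated with a minimal sketch that uses only the standard library. It is not mashumaro's actual generator: the helper `make_to_dict` is hypothetical and handles only flat dataclasses whose values are copied without conversion. It shows how inspecting the fields once, at build time, yields a specialized function that does no type inspection per call.

```python
from dataclasses import dataclass, fields


def make_to_dict(cls):
    """Build a to_dict function specialized for the fields of `cls`.

    Hypothetical helper for illustration only; it handles flat
    dataclasses and copies field values without any conversion.
    """
    # Inspect the schema once and emit source code for exactly these fields.
    items = ", ".join(f"{f.name!r}: obj.{f.name}" for f in fields(cls))
    source = f"def to_dict(obj):\n    return {{{items}}}\n"
    namespace = {}
    exec(source, namespace)  # compile the generated function
    return namespace["to_dict"], source


@dataclass
class StockPosition:
    ticker: str
    name: str
    balance: int


stock_to_dict, generated_source = make_to_dict(StockPosition)
position = StockPosition("AAPL", "Apple", 10)
result = stock_to_dict(position)
```

Printing `generated_source` shows the function body that was produced for this particular schema.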

Benchmark

  • macOS 11.5.2 Big Sur
  • Apple M1
  • 16GB RAM
  • Python 3.9.1

Load and dump sample data 1,000 times in 5 runs. The following figures show the best overall time in each case.

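| Framework | From dict time | From dict slowdown factor | To dict time | To dict slowdown factor |
|:------------|:---------|:--------|:---------|:-------|
| mashumaro | 0.04096 | 1x | 0.02741 | 1x |
| cattrs | 0.07307 | 1.78x | 0.05062 | 1.85x |
| pydantic | 0.24847 | 6.07x | 0.12292 | 4.48x |
| marshmallow | 0.29205 | 7.13x | 0.09310 | 3.4x |
| dataclasses | | | 0.22583 | 8.24x |
| dacite | 0.91553 | 22.35x | | |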
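
The measurement protocol described above (best of 5 runs, 1,000 calls per run) and the slowdown factors can be sketched with the standard `timeit` module. This is only an illustration under stated assumptions, not the benchmark script from the repository: `best_time`, `slowdown` and the sample workload are hypothetical.

```python
import timeit


def best_time(func, number=1000, repeat=5):
    """Return the best total time of `repeat` runs of `number` calls each."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


def slowdown(time, baseline):
    """Slowdown factor of `time` relative to the fastest `baseline`."""
    return round(time / baseline, 2)


# Hypothetical workload standing in for a from_dict / to_dict call.
sample = {"ticker": "AAPL", "name": "Apple", "balance": 10}
elapsed = best_time(lambda: dict(sample))

# Factors recomputed from the "From dict" figures reported above.
cattrs_factor = slowdown(0.07307, 0.04096)
pydantic_factor = slowdown(0.24847, 0.04096)
```

Taking the minimum over several runs reduces the influence of background noise on the reported figure.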

To run benchmark in your environment:

bash
git clone git@github.com:Fatal1ty/mashumaro.git
cd mashumaro
python3 -m venv env && source env/bin/activate
pip install -e .
pip install -r requirements-dev.txt
python benchmark/run.py

API

Mashumaro provides a couple of mixins for each format.

DataClassDictMixin.to_dict(use_bytes: bool, use_enum: bool, use_datetime: bool)

Make a dictionary from dataclass object based on the dataclass schema provided. Options include:

python
use_bytes: False     # False - convert bytes/bytearray objects to base64 encoded string, True - keep untouched
use_enum: False      # False - convert enum objects to enum values, True - keep untouched
use_datetime: False  # False - convert datetime oriented objects to ISO 8601 formatted string, True - keep untouched

DataClassDictMixin.from_dict(data: Mapping, use_bytes: bool, use_enum: bool, use_datetime: bool)

Make a new object from dict object based on the dataclass schema provided. Options include:

python
use_bytes: False     # False - load bytes/bytearray objects from base64 encoded string, True - keep untouched
use_enum: False      # False - load enum objects from enum values, True - keep untouched
use_datetime: False  # False - load datetime oriented objects from ISO 8601 formatted string, True - keep untouched
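
As a rough illustration of what the default `False` values mean, the following standard-library sketch performs the same kinds of conversion by hand. The helper `to_plain` is hypothetical and not part of mashumaro, and the exact string formats the library produces may differ.

```python
import base64
from datetime import datetime
from enum import Enum


class Currency(Enum):
    USD = "USD"


def to_plain(value):
    """Hypothetical helper: convert one value the way the defaults describe."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode()  # base64 encoded string
    if isinstance(value, Enum):
        return value.value  # enum value
    if isinstance(value, datetime):
        return value.isoformat()  # ISO 8601 formatted string
    return value


plain = {
    "raw": to_plain(b"abc"),
    "currency": to_plain(Currency.USD),
    "created": to_plain(datetime(2019, 1, 1, 12)),
}

# Loading reverses each conversion.
restored_raw = base64.b64decode(plain["raw"])
restored_currency = Currency(plain["currency"])
restored_created = datetime.fromisoformat(plain["created"])
```

Passing `True` for one of the flags corresponds to skipping the matching branch, so the value is kept untouched.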

DataClassJSONMixin.to_json(encoder: Optional[Encoder], dict_params: Optional[Mapping], **encoder_kwargs)

Make a JSON formatted string from dataclass object based on the dataclass schema provided. Options include:

encoder        # function called for json encoding, defaults to json.dumps
dict_params    # dictionary of parameter values passed under the hood to the `to_dict` function
encoder_kwargs # keyword arguments for encoder function

DataClassJSONMixin.from_json(data: Union[str, bytes, bytearray], decoder: Optional[Decoder], dict_params: Optional[Mapping], **decoder_kwargs)

Make a new object from JSON formatted string based on the dataclass schema provided. Options include:

decoder        # function called for json decoding, defaults to json.loads
dict_params    # dictionary of parameter values passed under the hood to the `from_dict` function
decoder_kwargs # keyword arguments for decoder function
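
The parameters suggest a two-step pipeline: build a dict, then hand it to the encoder (and the reverse when loading). A minimal sketch of that composition follows. It assumes nothing about mashumaro's internals: `to_json` and `from_json` here are standalone stand-ins that operate on a plain dict, not the mixin methods.

```python
import json


def to_json(data, encoder=json.dumps, **encoder_kwargs):
    """Stand-in: encode an already built dict with a pluggable encoder."""
    return encoder(data, **encoder_kwargs)


def from_json(text, decoder=json.loads, **decoder_kwargs):
    """Stand-in: decode JSON text into a dict with a pluggable decoder."""
    return decoder(text, **decoder_kwargs)


portfolio = {"ticker": "AAPL", "balance": 10}

# Extra keyword arguments are forwarded to the encoder unchanged.
compact = to_json(portfolio, separators=(",", ":"))
pretty = to_json(portfolio, indent=2, sort_keys=True)
round_trip = from_json(compact)
```

Because the encoder is just a callable, a faster third-party JSON library could be swapped in without touching the dict-building step.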

DataClassMessagePackMixin.to_msgpack(encoder: Optional[Encoder], dict_params: Optional[Mapping], **encoder_kwargs)

Make a MessagePack formatted bytes object from dataclass object based on the dataclass schema provided. Options include:

encoder        # function called for MessagePack encoding, defaults to msgpack.packb
dict_params    # dictionary of parameter values passed under the hood to the `to_dict` function
encoder_kwargs # keyword arguments for encoder function

DataClassMessagePackMixin.from_msgpack(data: Union[str, bytes, bytearray], decoder: Optional[Decoder], dict_params: Optional[Mapping], **decoder_kwargs)

Make a new object from MessagePack formatted data based on the dataclass schema provided. Options include:

decoder        # function called for MessagePack decoding, defaults to msgpack.unpackb
dict_params    # dictionary of parameter values passed under the hood to the `from_dict` function
decoder_kwargs # keyword arguments for decoder function

DataClassYAMLMixin.to_yaml(encoder: Optional[Encoder], dict_params: Optional[Mapping], **encoder_kwargs)

Make a YAML formatted bytes object from dataclass object based on the dataclass schema provided. Options include:

encoder        # function called for YAML encoding, defaults to yaml.dump
dict_params    # dictionary of parameter values passed under the hood to the `to_dict` function
encoder_kwargs # keyword arguments for encoder function

DataClassYAMLMixin.from_yaml(data: Union[str, bytes], decoder: Optional[Decoder], dict_params: Optional[Mapping], **decoder_kwargs)

Make a new object from YAML formatted data based on the dataclass schema provided. Options include:

decoder        # function called for YAML decoding, defaults to yaml.safe_load
dict_params    # dictionary of parameter values passed under the hood to the `from_dict` function
decoder_kwargs # keyword arguments for decoder function

Customization

SerializableType interface

If you already have a separate custom class and you want to serialize instances of it with mashumaro, you can achieve this by implementing the `SerializableType` interface:

from typing import Dict
from datetime import datetime
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializableType

class DateTime(datetime, SerializableType):
    def _serialize(self) -> Dict[str, int]:
        return {
            "year": self.year,
            "month": self.month,
            "day": self.day,
            "hour": self.hour,
            "minute": self.minute,
            "second": self.second,
        }

    @classmethod
    def _deserialize(cls, value: Dict[str, int]) -> 'DateTime':
        return DateTime(
            year=value['year'],
            month=value['month'],
            day=value['day'],
            hour=value['hour'],
            minute=value['minute'],
            second=value['second'],
        )

@dataclass
class Holiday(DataClassDictMixin):
    when: DateTime = DateTime.now()

new_year = Holiday(when=DateTime(2019, 1, 1, 12))
dictionary = new_year.to_dict()
# {'when': {'year': 2019, 'month': 1, 'day': 1, 'hour': 12, 'minute': 0, 'second': 0}}
assert Holiday.from_dict(dictionary) == new_year

If you have a custom generic type and are looking for a generic version of such an interface, see the GenericSerializableType interface section.

Field options

In some cases creating a new class just for one little thing could be excessive. Moreover, you may need to deal with third-party classes that you are not allowed to change. You can use the `dataclasses.field` function as a default field value to configure some serialization aspects through its `metadata` parameter. The next section describes all supported options to use in the `metadata` mapping.

`serialize` option

This option allows you to change the serialization method. The behaviour depends on the type of the option value, which can be either `Callable[[Any], Any]` or `str`.

A value of type `Callable[[Any], Any]` is a generic way to specify any callable object, such as a function, a class method, a class instance method, an instance of a callable class or even a lambda function, to be called for serialization.

A value of type `str` selects a specific engine for serialization. Keep in mind that the available engines depend on the field type the option is used with. At the moment the following serialization engines are available:

| Applicable field types | Supported engines | Description |
|:-------------------------- |:-------------------------|:------------------------------|
| `NamedTuple`, `namedtuple` | `as_list`, `as_dict` | How to pack named tuples. By default the `as_list` engine is used, which means your named tuple class instance will be packed into a list of its values. You can pack it into a dictionary using the `as_dict` engine. |

Example:

from datetime import datetime
from dataclasses import dataclass, field
from typing import NamedTuple
from mashumaro import DataClassDictMixin

class MyNamedTuple(NamedTuple):
    x: int
    y: float

@dataclass
class A(DataClassDictMixin):
    dt: datetime = field(
        metadata={
            "serialize": lambda v: v.strftime('%Y-%m-%d %H:%M:%S')
        }
    )
    t: MyNamedTuple = field(metadata={"serialize": "as_dict"})
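
The two shapes that the `as_list` and `as_dict` engines refer to can be seen with plain named tuple operations. This sketch uses only the standard library and makes no claim about how mashumaro implements the engines.

```python
from typing import NamedTuple


class MyNamedTuple(NamedTuple):
    x: int
    y: float


value = MyNamedTuple(1, 2.5)

as_list = list(value)  # positional values only
as_dict = value._asdict()  # field names mapped to values

# Both shapes carry enough information to rebuild the tuple.
from_list = MyNamedTuple(*as_list)
from_dict = MyNamedTuple(**as_dict)
```

The list form is more compact, while the dict form keeps the field names in the serialized data.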

`deserialize` option

This option allows you to change the deserialization method. The behaviour depends on the type of the option value, which can be either `Callable[[Any], Any]` or `str`.

A value of type `Callable[[Any], Any]` is a generic way to specify any callable object, such as a function, a class method, a class instance method, an instance of a callable class or even a lambda function, to be called for deserialization.

A value of type `str` selects a specific engine for deserialization. Keep in mind that the available engines depend on the field type the option is used with. At the moment the following deserialization engines are available:

| Applicable field types | Supported engines | Description |
|:-------------------------- |:-------------------------|:------------------------------|
| `datetime`, `date`, `time` | `ciso8601`, `pendulum` | How to parse a datetime string. By default the native `fromisoformat` of the corresponding class is used for `datetime`, `date` and `time` fields. It's the fastest way in most cases, but you can choose an alternative. |
| `NamedTuple`, `namedtuple` | `as_list`, `as_dict` | How to unpack named tuples. By default the `as_list` engine is used, which means your named tuple class instance will be created from a list of its values. You can unpack it from a dictionary using the `as_dict` engine. |

Example:

from datetime import datetime
from dataclasses import dataclass, field
from typing import List, NamedTuple
from mashumaro import DataClassDictMixin
import ciso8601
import dateutil.parser

class MyNamedTuple(NamedTuple):
    x: int
    y: float

@dataclass
class A(DataClassDictMixin):
    x: datetime = field(
        metadata={"deserialize": "pendulum"}
    )

@dataclass
class B(DataClassDictMixin):
    x: datetime = field(
        metadata={"deserialize": ciso8601.parse_datetime_as_naive}
    )

@dataclass
class C(DataClassDictMixin):
    dt: List[datetime] = field(
        metadata={
            "deserialize": lambda l: list(map(dateutil.parser.isoparse, l))
        }
    )

@dataclass
class D(DataClassDictMixin):
    x: MyNamedTuple = field(metadata={"deserialize": "as_dict"})

`serialization_strategy` option

This option is useful when you want to change the serialization behaviour for a class depending on some defined parameters. In this case you can create a special class implementing the `SerializationStrategy` interface:

from dataclasses import dataclass, field
from datetime import datetime
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializationStrategy

class FormattedDateTime(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value: datetime) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> datetime:
        return datetime.strptime(value, self.fmt)

@dataclass
class DateTimeFormats(DataClassDictMixin):
    short: datetime = field(
        metadata={
            "serialization_strategy": FormattedDateTime(
                fmt="%d%m%Y%H%M%S",
            )
        }
    )
    verbose: datetime = field(
        metadata={
            "serialization_strategy": FormattedDateTime(
                fmt="%A %B %d, %Y, %H:%M:%S",
            )
        }
    )

formats = DateTimeFormats(
    short=datetime(2019, 1, 1, 12),
    verbose=datetime(2019, 1, 1, 12),
)
dictionary = formats.to_dict()
# {'short': '01012019120000', 'verbose': 'Tuesday January 01, 2019, 12:00:00'}
assert DateTimeFormats.from_dict(dictionary) == formats

`alias` option

In some cases it's better to have different names for a field in your class and in its serialized view. For example, a third-party legacy API you are working with might use camel case, while you stick to snake case in your code base. Or you may want to load data with keys that are invalid identifiers in Python. This problem is easily solved by using aliases:

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options

@dataclass
class DataClass(DataClassDictMixin):
    a: int = field(metadata=field_options(alias="FieldA"))
    b: int = field(metadata=field_options(alias="#invalid"))

x = DataClass.from_dict({"FieldA": 1, "#invalid": 2})  # DataClass(a=1, b=2)
x.to_dict()  # {"a": 1, "b": 2}  # no aliases on serialization by default

If you want to write all the field aliases in one place, use the `aliases` config option.

If you want to serialize all the fields by aliases you have two options:

  • the `serialize_by_alias` config option
  • the `by_alias` keyword argument in the `to_dict` method

It's hard to imagine when it might be necessary to serialize only specific fields by alias, but such functionality is easily added to the library. Open an issue if you need it.

If you don't want to remember the names of the options you can use the `field_options` helper function:
from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options

@dataclass
class A(DataClassDictMixin):
    x: int = field(
        metadata=field_options(
            serialize=str,
            deserialize=int,
            ...
        )
    )

More options are on the way. If you know which option would be useful for many, please don't hesitate to create an issue or pull request.

Config options

If you rely on inheritance, you'll fall in love with the `Config` class. You can register `serialize` and `deserialize` methods, define code generation options and configure other things in just one place, or differently in individual classes if you need flexibility. Inheritance always comes first.

There is a base class `BaseConfig` that you can inherit from for the sake of convenience, but it's not mandatory.

In the following example you can see how the `debug` flag changes from class to class: `ModelA` will have debug mode enabled but `ModelB` will not.
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

class BaseModel(DataClassDictMixin):
    class Config(BaseConfig):
        debug = True

class ModelA(BaseModel):
    a: int

class ModelB(BaseModel):
    b: int

    class Config(BaseConfig):
        debug = False
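
The way a nested `Config` is picked up through inheritance follows from ordinary Python attribute lookup, which the sketch below demonstrates without mashumaro. The helper `effective_debug` is hypothetical, and the library's own resolution of config options may involve more than this.

```python
class BaseModel:
    class Config:
        debug = True


class ModelA(BaseModel):
    pass  # no Config of its own, so BaseModel.Config is found


class ModelB(BaseModel):
    class Config:
        debug = False  # shadows the inherited Config


def effective_debug(cls):
    """Hypothetical helper: read the flag from the nearest Config class."""
    return getattr(cls.Config, "debug", False)


flags = {cls.__name__: effective_debug(cls) for cls in (BaseModel, ModelA, ModelB)}
```

A subclass that defines its own `Config` replaces the inherited one as a whole, which is why the example above makes `ModelB.Config` inherit from `BaseConfig` to keep the remaining defaults.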

Next section describes all supported options to use in the config.

`debug` config option

If you enable the `debug` option, the generated code for your data class will be printed.

`code_generation_options` config option

Some users may need functionality that comes at an extra cost, such as CPU time spent executing additional instructions. Since not everyone needs such instructions, they can be enabled by adding a constant to the list, so the fastest basic behaviour of the library always remains the default. The following table provides a brief overview of all the available constants described below.

| Constant | Description |
|:--------------------------------------------------------------- |:---------------------------------------------------------------------------|
| `TO_DICT_ADD_OMIT_NONE_FLAG` | Adds the `omit_none` keyword-only argument to the `to_dict` method. |
| `TO_DICT_ADD_BY_ALIAS_FLAG` | Adds the `by_alias` keyword-only argument to the `to_dict` method. |
| `ADD_DIALECT_SUPPORT` | Adds the `dialect` keyword-only argument to the `from_dict` and `to_dict` methods. |

`serialization_strategy` config option

You can register custom `SerializationStrategy`, `serialize` and `deserialize` methods for specific types in just one place. This is configured using a dictionary with types as keys. Each value can be either a `SerializationStrategy` instance or a dictionary with `serialize` and `deserialize` values that have the same meaning as in the field options.

from dataclasses import dataclass
from datetime import datetime, date
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig
from mashumaro.types import SerializationStrategy

class FormattedDateTime(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value: datetime) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> datetime:
        return datetime.strptime(value, self.fmt)

@dataclass
class DataClass(DataClassDictMixin):

    datetime: datetime
    date: date

    class Config(BaseConfig):
        serialization_strategy = {
            datetime: FormattedDateTime("%Y"),
            date: {
                # you can use specific str values for datetime here as well
                "deserialize": "pendulum",
                "serialize": date.isoformat,
            },
        }

instance = DataClass.from_dict({"datetime": "2021", "date": "2021"})
# DataClass(datetime=datetime.datetime(2021, 1, 1, 0, 0), date=Date(2021, 1, 1))
dictionary = instance.to_dict()
# {'datetime': '2021', 'date': '2021-01-01'}

`aliases` config option

Sometimes it's better to write the field aliases in one place. You can mix aliases here with aliases in the field options, but the latter will always take precedence.

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    a: int
    b: int

    class Config(BaseConfig):
        aliases = {
            "a": "FieldA",
            "b": "FieldB",
        }

DataClass.from_dict({"FieldA": 1, "FieldB": 2})  # DataClass(a=1, b=2)

`serialize_by_alias` config option

All the fields with aliases will be serialized by them by default when this option is enabled. You can mix this config option with the `by_alias` keyword argument.

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options
from mashumaro.config import BaseConfig

@dataclass
class DataClass(DataClassDictMixin):
    field_a: int = field(metadata=field_options(alias="FieldA"))

    class Config(BaseConfig):
        serialize_by_alias = True

DataClass(field_a=1).to_dict()  # {'FieldA': 1}

`namedtuple_as_dict` config option

Dataclasses are a great way to declare and use data models, but they are not the only way. Python has a typed version of `namedtuple` called `NamedTuple`, which looks similar to dataclasses:

from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int

The same with a dataclass looks like this:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

At first glance, you can use either option. But imagine that you need to create a large number of instances of the `Point` class. Because of how dataclasses work, they consume more memory than named tuples. In such a case it could be more appropriate to use named tuples.
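
The memory difference can be checked with a small standard-library sketch. It assumes CPython and a regular dataclass (without `__slots__`); the class names are hypothetical and exact sizes vary between interpreter versions, so no figures are quoted here.

```python
import sys
from dataclasses import dataclass
from typing import NamedTuple


class PointTuple(NamedTuple):
    x: int
    y: int


@dataclass
class PointData:
    x: int
    y: int


t = PointTuple(0, 0)
d = PointData(0, 0)

# A named tuple stores its values inline and has no per-instance __dict__.
tuple_size = sys.getsizeof(t)
# A regular dataclass instance also owns a __dict__ holding its attributes.
data_size = sys.getsizeof(d) + sys.getsizeof(d.__dict__)
```

Comparing `tuple_size` with `data_size` on your interpreter shows the per-instance overhead that adds up when many points are created.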

By default, all named tuples are packed into lists. But with the `namedtuple_as_dict` option you have a drop-in replacement for dataclasses:

from dataclasses import dataclass
from typing import List, NamedTuple
from mashumaro import DataClassDictMixin

class Point(NamedTuple):
    x: int
    y: int

@dataclass
class DataClass(DataClassDictMixin):
    points: List[Point]

    class Config:
        namedtuple_as_dict = True

obj = DataClass.from_dict({"points": [{"x": 0, "y": 0}, {"x": 1, "y": 1}]})
print(obj.to_dict())  # {"points": [{"x": 0, "y": 0}, {"x": 1, "y": 1}]}

If you want to serialize only certain named tuple fields as dictionaries, you can use the corresponding serialization and deserialization engines.

`allow_postponed_evaluation` config option

PEP 563 solved the problem of forward references by postponing the evaluation of annotations, so you can write the following code:

from __future__ import annotations
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A(DataClassDictMixin):
    x: B

@dataclass
class B(DataClassDictMixin):
    y: int

obj = A.from_dict({'x': {'y': 1}})

You don't need to write anything special here; forward references work out of the box. If a field of a dataclass has a forward reference in its type annotation, building of the `from_dict` and `to_dict` methods of this dataclass will be postponed until they are called for the first time. However, if for some reason you don't want the evaluation to be postponed, you can disable it using the `allow_postponed_evaluation` option:

from __future__ import annotations
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class A(DataClassDictMixin):
    x: B

    class Config:
        allow_postponed_evaluation = False

# UnresolvedTypeReferenceError: Class A has unresolved type reference B
# in some of its fields

@dataclass
class B(DataClassDictMixin):
    y: int

In this case you will get `UnresolvedTypeReferenceError` regardless of whether class `B` is declared below or not.
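
Why a forward reference cannot be resolved at class creation time can be seen with `typing.get_type_hints` from the standard library. The sketch below illustrates only the underlying Python behaviour and is not mashumaro's resolution logic; the namespaces are passed explicitly so that it also works when executed outside a module.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class A:
    x: "B"  # forward reference: B is not defined yet


try:
    get_type_hints(A, globals(), locals())  # evaluating the annotation now fails
    resolved_early = True
except NameError:
    resolved_early = False


@dataclass
class B:
    y: int


# Once B exists, the same annotation can be evaluated.
hints = get_type_hints(A, globals(), locals())
```

Postponing the work until the first call gives the referenced class time to be defined, which is the trade-off the option above lets you turn off.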

`dialect` config option

This option is described below in the Dialects section.

Dialects

Sometimes you need different serialization and deserialization methods depending on the data source where entities of the dataclass are stored, or on the API that the entities are sent to or received from. There is a special `Dialect` type that can contain all the differences from the default serialization and deserialization methods. You can create different dialects and use each of them for the same dataclass depending on the situation.

Suppose we have the following dataclass with a field of type `date`:

python
@dataclass
class Entity(DataClassDictMixin):
    dt: date

By default, a field of `date` type serializes to a string in ISO 8601 format, so the serialized entity will look like `{'dt': '2021-12-31'}`. But what if we have, for example, two sensitive legacy Ethiopian and Japanese APIs that use two different formats for dates, `dd/mm/yyyy` and `yyyy年mm月dd日`? Instead of creating two similar dataclasses we can have one dataclass and two dialects:

```python
from dataclasses import dataclass
from datetime import date, datetime
from mashumaro import DataClassDictMixin
from mashumaro.config import ADD_DIALECT_SUPPORT
from mashumaro.dialect import Dialect
from mashumaro.types import SerializationStrategy

class DateTimeSerializationStrategy(SerializationStrategy):
    def __init__(self, fmt: str):
        self.fmt = fmt

    def serialize(self, value: date) -> str:
        return value.strftime(self.fmt)

    def deserialize(self, value: str) -> date:
        return datetime.strptime(value, self.fmt).date()

class EthiopianDialect(Dialect):
    serialization_strategy = {
        date: DateTimeSerializationStrategy("%d/%m/%Y")
    }

class JapaneseDialect(Dialect):
    serialization_strategy = {
        date: DateTimeSerializationStrategy("%Y年%m月%d日")
    }

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config:
        code_generation_options = [ADD_DIALECT_SUPPORT]

entity = Entity(date(2021, 12, 31))
entity.to_dict(dialect=EthiopianDialect)  # {'dt': '31/12/2021'}
entity.to_dict(dialect=JapaneseDialect)  # {'dt': '2021年12月31日'}
Entity.from_dict({'dt': '2021年12月31日'}, dialect=JapaneseDialect)
```

`serialization_strategy` dialect option

This dialect option has the same meaning as the similar config option, but for the dialect scope. You can register custom `SerializationStrategy`, `serialize` and `deserialize` methods for specific types.

Changing the default dialect

You can change the default serialization and deserialization methods for a dataclass not only with the `serialization_strategy` config option but also with the `dialect` config option. If you have multiple dataclasses without a common parent class, the default dialect can help you reduce the amount of code you write:

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config:
        dialect = JapaneseDialect

entity = Entity(date(2021, 12, 31))
entity.to_dict()  # {'dt': '2021年12月31日'}
assert Entity.from_dict({'dt': '2021年12月31日'}) == entity

Code generation options

Add `omit_none` keyword argument

If you want to have control over whether to skip `None` values on serialization, you can add the `omit_none` parameter to the `to_dict` method using the `code_generation_options` list:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig, TO_DICT_ADD_OMIT_NONE_FLAG

@dataclass
class Inner(DataClassDictMixin):
    # "x" won't be omitted since there is no TO_DICT_ADD_OMIT_NONE_FLAG here
    x: int = None

@dataclass
class Model(DataClassDictMixin):
    x: Inner
    a: int = None
    b: str = None  # will be omitted

    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_OMIT_NONE_FLAG]

Model(x=Inner(), a=1).to_dict(omit_none=True)  # {'x': {'x': None}, 'a': 1}

Add `by_alias` keyword argument

If you want to have control over whether to serialize fields by their aliases, you can add the `by_alias` parameter to the `to_dict` method using the `code_generation_options` list. The default value of the `by_alias` parameter depends on whether the `serialize_by_alias` config option is enabled.

from dataclasses import dataclass, field
from mashumaro import DataClassDictMixin, field_options
from mashumaro.config import BaseConfig, TO_DICT_ADD_BY_ALIAS_FLAG

@dataclass
class DataClass(DataClassDictMixin):
    field_a: int = field(metadata=field_options(alias="FieldA"))

    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_BY_ALIAS_FLAG]

DataClass(field_a=1).to_dict()  # {'field_a': 1}
DataClass(field_a=1).to_dict(by_alias=True)  # {'FieldA': 1}

Keep in mind that if you're serializing data to JSON or another format, you need to pass the `by_alias` argument in the `dict_params` dictionary.

Add `dialect` keyword argument

Support for dialects is disabled by default for performance reasons. You can enable it using the `ADD_DIALECT_SUPPORT` constant:

```python
from dataclasses import dataclass
from datetime import date
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig, ADD_DIALECT_SUPPORT

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config(BaseConfig):
        code_generation_options = [ADD_DIALECT_SUPPORT]
```

User-defined generic types

There is support for user-defined generic types. You can inherit generic dataclasses along with overwriting types in them, use generic dataclasses as field types, or create your own generic types with serialization under your control.

User-defined generic dataclasses

If you have a generic version of a dataclass and want to serialize and deserialize its instances depending on the concrete types, you can achieve this using inheritance:

```python
from dataclasses import dataclass
from datetime import date
from typing import Generic, Mapping, TypeVar
from mashumaro import DataClassDictMixin

KT = TypeVar("KT")
VT = TypeVar("VT", date, str)

@dataclass
class GenericDataClass(Generic[KT, VT]):
    x: Mapping[KT, VT]

@dataclass
class ConcreteDataClass(GenericDataClass[str, date], DataClassDictMixin):
    pass

ConcreteDataClass.from_dict({"x": {"a": "2021-01-01"}})  # ok
ConcreteDataClass.from_dict({"x": {"a": "not a date but str"}})  # error
```

You can override a `TypeVar` field with a concrete type or another `TypeVar`. Partial specification of concrete types is also allowed. If a generic dataclass is inherited without type overriding, the types of its fields remain untouched.

Generic dataclasses as field types

Another approach is to specify concrete types in the field type hints. This can help to have different versions of the same generic dataclass:

from dataclasses import dataclass
from datetime import date
from typing import Generic, TypeVar
from mashumaro import DataClassDictMixin

T = TypeVar('T')

@dataclass
class GenericDataClass(Generic[T], DataClassDictMixin):
    x: T

@dataclass
class DataClass(DataClassDictMixin):
    date: GenericDataClass[date]
    str: GenericDataClass[str]

instance = DataClass(
    date=GenericDataClass(x=date(2021, 1, 1)),
    str=GenericDataClass(x='2021-01-01'),
)
dictionary = {'date': {'x': '2021-01-01'}, 'str': {'x': '2021-01-01'}}
assert DataClass.from_dict(dictionary) == instance

GenericSerializableType interface

There is a generic alternative to `SerializableType` called `GenericSerializableType`. It makes it possible to serialize and deserialize instances of generic types depending on the types provided:
from typing import Dict, TypeVar, Iterator
from datetime import datetime
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import GenericSerializableType

KT = TypeVar("KT", int, str)
VT = TypeVar("VT", int, str)

class GenericDict(Dict[KT, VT], GenericSerializableType):
    def _serialize(self, types) -> Dict[KT, VT]:
        k_type, v_type = types
        if k_type not in (int, str) or v_type not in (int, str):
            raise TypeError
        return {k_type(k): v_type(v) for k, v in self.items()}

    @classmethod
    def _deserialize(cls, value, types) -> 'GenericDict[KT, VT]':
        k_type, v_type = types
        if k_type not in (int, str) or v_type not in (int, str):
            raise TypeError
        return cls({k_type(k): v_type(v) for k, v in value.items()})

@dataclass
class DataClass(DataClassDictMixin):
    x: GenericDict[int, str]
    y: GenericDict[str, int]

instance = DataClass(GenericDict({1: 'a'}), GenericDict({'b': 2}))
dictionary = instance.to_dict()  # {'x': {1: 'a'}, 'y': {'b': 2}}
assert DataClass.from_dict(dictionary) == instance

The difference between `SerializableType` and `GenericSerializableType` is that the methods of `GenericSerializableType` have a parameter `types`, to which the concrete types will be passed. If you don't need this information you can still use the `SerializableType` interface even with generic types.
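
Where such concrete types come from can be illustrated with `typing.get_origin` and `typing.get_args` (Python 3.8+). This sketch only shows how type arguments can be read from a parametrized annotation; how mashumaro collects and passes them is not shown in this document, so treat the flow below as an assumption.

```python
from typing import Dict, TypeVar, get_args, get_origin

KT = TypeVar("KT")
VT = TypeVar("VT")


class GenericDict(Dict[KT, VT]):
    """A user-defined generic type similar to the one in the example above."""


annotation = GenericDict[int, str]

origin = get_origin(annotation)  # the unparametrized class
k_type, v_type = get_args(annotation)  # the concrete type arguments

# Convert raw keys and values with the concrete types, as a
# deserializer receiving the types might do.
raw = {"1": "a"}
converted = GenericDict({k_type(k): v_type(v) for k, v in raw.items()})
```

The same annotation with different arguments, such as `GenericDict[str, int]`, would yield a different pair of types for the same class.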

Serialization hooks

In some cases you need to prepare input / output data or do some extraordinary actions at different stages of the deserialization / serialization lifecycle. You can do this with different types of hooks.

Before deserialization

To do something with a dictionary before it is passed to deserialization, you can use the `__pre_deserialize__` class method:

from dataclasses import dataclass
from typing import Any, Dict
from mashumaro import DataClassJSONMixin

@dataclass
class DataClass(DataClassJSONMixin):
    abc: int

    @classmethod
    def __pre_deserialize__(cls, d: Dict[Any, Any]) -> Dict[Any, Any]:
        return {k.lower(): v for k, v in d.items()}

print(DataClass.from_dict({"ABC": 123}))  # DataClass(abc=123)
print(DataClass.from_json('{"ABC": 123}'))  # DataClass(abc=123)

After deserialization

To do something with a dataclass instance that was created as a result of deserialization, you can use the `__post_deserialize__` class method:

from dataclasses import dataclass
from mashumaro import DataClassJSONMixin

@dataclass
class DataClass(DataClassJSONMixin):
    abc: int

    @classmethod
    def __post_deserialize__(cls, obj: 'DataClass') -> 'DataClass':
        obj.abc = 456
        return obj

print(DataClass.from_dict({"abc": 123}))  # DataClass(abc=456)
print(DataClass.from_json('{"abc": 123}'))  # DataClass(abc=456)

Before serialization

To do something before serialization, you can use the `__pre_serialize__` method:

from dataclasses import dataclass
from typing import ClassVar
from mashumaro import DataClassJSONMixin

@dataclass
class DataClass(DataClassJSONMixin):
    abc: int
    counter: ClassVar[int] = 0

    def __pre_serialize__(self) -> 'DataClass':
        self.counter += 1
        return self

obj = DataClass(abc=123)
obj.to_dict()
obj.to_json()
print(obj.counter)  # 2

After serialization

To do something with a dictionary that was created as a result of serialization, you can use the `__post_serialize__` method:

from dataclasses import dataclass
from typing import Any, Dict
from mashumaro import DataClassJSONMixin

@dataclass
class DataClass(DataClassJSONMixin):
    user: str
    password: str

    def __post_serialize__(self, d: Dict[Any, Any]) -> Dict[Any, Any]:
        d.pop('password')
        return d

obj = DataClass(user="name", password="secret")
print(obj.to_dict())  # {"user": "name"}
print(obj.to_json())  # '{"user": "name"}'

TODO

  • add optional validation
  • write custom useful types such as URL, Email etc
