Custom Type Example¶
This is an example of using a custom type with PyMongo. The example
here is a bit contrived, but shows how to use a
SONManipulator
to manipulate
documents as they are saved or retrieved from MongoDB. More
specifically, it shows a couple different mechanisms for working with
custom datatypes in PyMongo.
Setup¶
We’ll start by getting a clean database to use for the example:
>>> from pymongo.mongo_client import MongoClient
>>> client = MongoClient()
>>> client.drop_database("custom_type_example")
>>> db = client.custom_type_example
Since the purpose of the example is to demonstrate working with custom
types, we’ll need a custom datatype to use. Here we define the aptly
named Custom
class, which has a single method, x()
:
>>> class Custom(object):
... def __init__(self, x):
... self.__x = x
...
... def x(self):
... return self.__x
...
>>> foo = Custom(10)
>>> foo.x()
10
When we try to save an instance of Custom
with PyMongo, we’ll
get an InvalidDocument
exception:
>>> db.test.insert({"custom": Custom(5)})
Traceback (most recent call last):
InvalidDocument: cannot convert value of type <class 'Custom'> to bson
Manual Encoding¶
One way to work around this is to manipulate our data into something
we can save with PyMongo. To do so we define two methods,
encode_custom()
and decode_custom()
:
>>> def encode_custom(custom):
... return {"_type": "custom", "x": custom.x()}
...
>>> def decode_custom(document):
... assert document["_type"] == "custom"
... return Custom(document["x"])
...
We can now manually encode and decode Custom
instances and
use them with PyMongo:
>>> db.test.insert({"custom": encode_custom(Custom(5))})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
>>> decode_custom(db.test.find_one()["custom"])
<Custom object at ...>
>>> decode_custom(db.test.find_one()["custom"]).x()
5
Automatic Encoding and Decoding¶
Needless to say, that was a little unwieldy. Let’s make this a bit
more seamless by creating a new
SONManipulator
.
SONManipulator
instances allow you
to specify transformations to be applied automatically by PyMongo:
>>> from pymongo.son_manipulator import SONManipulator
>>> class Transform(SONManipulator):
... def transform_incoming(self, son, collection):
... for (key, value) in son.items():
... if isinstance(value, Custom):
... son[key] = encode_custom(value)
... elif isinstance(value, dict): # Make sure we recurse into sub-docs
... son[key] = self.transform_incoming(value, collection)
... return son
...
... def transform_outgoing(self, son, collection):
... for (key, value) in son.items():
... if isinstance(value, dict):
... if "_type" in value and value["_type"] == "custom":
... son[key] = decode_custom(value)
... else: # Again, make sure to recurse into sub-docs
... son[key] = self.transform_outgoing(value, collection)
... return son
...
Now we add our manipulator to the Database
:
>>> db.add_son_manipulator(Transform())
After doing so we can save and restore Custom
instances seamlessly:
>>> db.test.remove() # remove whatever has already been saved
{...}
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
If we get a new Database
instance we’ll
clear out the SONManipulator
instance we added:
>>> db = client.custom_type_example
This allows us to see what was actually saved to the database:
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
which is the same format that we encode to with our
encode_custom()
method!
Binary Encoding¶
We can take this one step further by encoding to binary, using a user
defined subtype. This allows us to identify what to decode without
resorting to tricks like the _type
field used above.
We’ll start by defining the methods to_binary()
and
from_binary()
, which convert Custom
instances to and
from Binary
instances:
Note
You could just pickle the instance and save that. What we do here is a little more lightweight.
>>> from bson.binary import Binary
>>> def to_binary(custom):
... return Binary(str(custom.x()), 128)
...
>>> def from_binary(binary):
... return Custom(int(binary))
...
Next we’ll create another
SONManipulator
, this time using the
methods we just defined:
>>> class TransformToBinary(SONManipulator):
... def transform_incoming(self, son, collection):
... for (key, value) in son.items():
... if isinstance(value, Custom):
... son[key] = to_binary(value)
... elif isinstance(value, dict):
... son[key] = self.transform_incoming(value, collection)
... return son
...
... def transform_outgoing(self, son, collection):
... for (key, value) in son.items():
... if isinstance(value, Binary) and value.subtype == 128:
... son[key] = from_binary(value)
... elif isinstance(value, dict):
... son[key] = self.transform_outgoing(value, collection)
... return son
...
Now we’ll empty the Database
and add the
new manipulator:
>>> db.test.remove()
{...}
>>> db.add_son_manipulator(TransformToBinary())
After doing so we can save and restore Custom
instances
seamlessly:
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
We can see what’s actually being saved to the database (and verify
that it is using a Binary
instance) by
clearing out the manipulators and repeating our
find_one()
:
>>> db = client.custom_type_example
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': Binary('5', 128)}