Converting pyobjects to OCaml Types

With pyml_bindgen, you generally want to set up a binding from a single Python class to a single OCaml module.

E.g.,

class Foo:
    def __init__(self, x):
        self.x = x

    def add1(self):
        self.x += 1

would have a corresponding module something like this

module Foo : sig
  type t

  val of_pyobject : Pytypes.pyobject -> t
  val to_pyobject : t -> Pytypes.pyobject

  val x : t -> int

  val add1 : t -> unit -> unit
end = struct
  type t = Pytypes.pyobject

  let of_pyobject x = x
  let to_pyobject x = x

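  (* One possible implementation using pyml; generated code may differ. *)
  let x t = Py.Int.to_int (Py.Object.find_attr_string t "x")

  let add1 t () = ignore (Py.Object.call_method t "add1" [||])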
end

In the above example, we don't bother checking the Python-land type of the pyobject. All the OCaml compiler knows at compile time is that we are taking a Pytypes.pyobject and getting back a Foo.t.

Depending on how that pyobject was actually created elsewhere in the code, it might not be an instance of the Foo class. In that case, when you call the Foo.add1 function in your OCaml code, you will get a runtime error. Let me give you an example.

let i = Py.Int.of_int 1
let foo = Foo.of_pyobject i

let _ = Foo.add1 foo ()
(* ERROR! *)

You'll get an exception: Exception: E (<class 'AttributeError'>, 'int' object has no attribute 'add1').

Checking pyobjects at module boundaries

While you could remove the of_pyobject function from the interface, you are often going to need it outside the module. For example, you may have a Python class Foo that has a method which returns an object of class Bar. In your OCaml code, you'd need to call the Bar.of_pyobject function from inside the Foo module.

Basically, you would like to have an of_pyobject that actually checks that the underlying Python type is what the module expects. I.e., you only want to create a Foo.t if the pyobject is a Foo object in Python-land.

In pyml_bindgen, you can address this problem in the typical OCaml way: by having of_pyobject return t option or t Or_error.t instead of t. Let's see what I mean.

pyml_bindgen automatically generates of_pyobject and to_pyobject functions for you (in fact, you shouldn't provide those yourself).

You can generate three kinds of of_pyobject function with pyml_bindgen:

  • No checking: val of_pyobject : Pytypes.pyobject -> t
  • option returning: val of_pyobject : Pytypes.pyobject -> t option
  • Base Or_error.t returning: val of_pyobject : Pytypes.pyobject -> t Or_error.t

You can choose between the three with the --of-pyo-ret-type option. Here is the section from the man page:

-r OF_PYO_RET_TYPE, --of-pyo-ret-type=OF_PYO_RET_TYPE (absent=option)
    Return type of the of_pyobject function. OF_PYO_RET_TYPE must be
    one of `no_check', `option' or `or_error'.
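To make the difference between the three return types concrete, here is a minimal sketch. It does not use pyml or show what pyml_bindgen actually generates: the pyobject record below is a hypothetical stand-in that only records a Python class name, and the standard library's result type stands in for Base's Or_error.t, so that the example runs on its own.

```ocaml
(* Hypothetical stand-in for Pytypes.pyobject: it only records the name of
   the Python class, so this example runs without pyml. *)
type pyobject = { py_class : string }

(* no_check: every pyobject is accepted as a Foo.t. *)
module Foo_no_check = struct
  type t = pyobject

  let of_pyobject (x : pyobject) : t = x
end

(* option: None when the Python-land class is not Foo. *)
module Foo_option = struct
  type t = pyobject

  let of_pyobject (x : pyobject) : t option =
    if String.equal x.py_class "Foo" then Some x else None
end

(* or_error: Stdlib's result is used here in place of Base's Or_error.t. *)
module Foo_or_error = struct
  type t = pyobject

  let of_pyobject (x : pyobject) : (t, string) result =
    if String.equal x.py_class "Foo" then Ok x
    else Error (Printf.sprintf "expected Foo, got %s" x.py_class)
end

let () =
  let an_int = { py_class = "int" } in
  let a_foo = { py_class = "Foo" } in
  (* The unchecked version silently accepts the wrong class. *)
  ignore (Foo_no_check.of_pyobject an_int);
  (* The checked versions reject it at the module boundary. *)
  assert (Foo_option.of_pyobject an_int = None);
  assert (Foo_option.of_pyobject a_foo = Some a_foo);
  assert (Foo_or_error.of_pyobject an_int = Error "expected Foo, got int")
```

With the checked variants, the mistake is caught where the value enters the module, not later when a method is called on it.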

While the option and Or_error.t return types let you avoid a lot of potential runtime problems, they force you to deal with potential errors each time of_pyobject is called. And in code generated by pyml_bindgen, you may not realize that it is being called!

Say that you generated both Person and Job modules with the --of-pyo-ret-type=option command line option. Then both of these modules will have of_pyobject functions that return t option rather than just t.

Note: For now, you can only generate one of these module signatures at a time with pyml_bindgen. To combine them, you'll have to run it multiple times and then combine manually.

Here is an example of code that won't work.

module rec Person : sig
  type t
  val of_pyobject : Pytypes.pyobject -> t option
  (* Oops! *)
  val get_job : t -> unit -> Job.t
  ...
end = struct ... end

and Job : sig
  type t
  val of_pyobject : Pytypes.pyobject -> t option
  ...
end = struct ... end

When pyml_bindgen sees a function that returns a custom type (a module type like Job.t, Person.t, or whatever), the generated code will call that type's of_pyobject function to convert the result to the correct OCaml type. So, for Person.get_job it will generate a function that calls Job.of_pyobject somewhere in the get_job implementation. Of course, Job.of_pyobject returns Job.t option and not Job.t. But in the Person.get_job signature, we've specified that get_job returns Job.t and NOT Job.t option.

Now, pyml_bindgen will happily generate this implementation for you, but when you try to actually compile it, you will get an error about the return type of the get_job implementation not matching the expected signature.

So what do you do? Well, you have to remember that the --of-pyo-ret-type=option and --of-pyo-ret-type=or_error flags will essentially poison all generated functions that manipulate other auto-generated modules.

Specifically, for this example, you can't write val get_job : t -> unit -> Job.t. Instead, you have to write val get_job : t -> unit -> Job.t option. Just so that it's clear: the reason is that Job.of_pyobject returns Job.t option, and the generated implementation of Person.get_job will call Job.of_pyobject somewhere in its body.
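Here is a rough, self-contained sketch of the corrected shape. It is hand-written for illustration and is not pyml_bindgen output: the pyobject variant is a hypothetical stand-in for Pytypes.pyobject, and the bodies only mimic the idea that get_job passes its raw result through Job.of_pyobject.

```ocaml
(* Hypothetical stand-in for Pytypes.pyobject, so the example runs
   without pyml. *)
type pyobject =
  | Py_int of int
  | Py_job of string
  | Py_person of string * pyobject (* name, and whatever get_job returns *)

module Job : sig
  type t

  val of_pyobject : pyobject -> t option
  val to_pyobject : t -> pyobject
end = struct
  type t = pyobject

  let of_pyobject x = match x with Py_job _ -> Some x | _ -> None
  let to_pyobject x = x
end

module Person : sig
  type t

  val of_pyobject : pyobject -> t option

  (* Job.t option, because the body goes through Job.of_pyobject. *)
  val get_job : t -> unit -> Job.t option
end = struct
  type t = pyobject

  let of_pyobject x = match x with Py_person _ -> Some x | _ -> None

  let get_job t () =
    match t with
    | Py_person (_, raw) -> Job.of_pyobject raw
    | _ -> None (* not reachable for values built with of_pyobject *)
end

let () =
  let alice = Py_person ("Alice", Py_job "engineer") in
  match Person.of_pyobject alice with
  | None -> assert false
  | Some person -> (
      match Person.get_job person () with
      | Some job -> assert (Job.to_pyobject job = Py_job "engineer")
      | None -> assert false)
```

Changing the signature of get_job back to return a plain Job.t makes this sketch fail to compile, which is the same kind of signature mismatch described above.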

Wrap-up

You have to be aware of the return types of the of_pyobject functions you're generating with pyml_bindgen. If you use option or Or_error.t, you have to remember to adjust your value specifications accordingly!